FUNGAL GENE CLUSTER ASSOCIATED
WITH PATHOGENESIS 

Cross-Reference to Related Applications 

This application claims the benefit of the filing date of U.S. application
Serial No. 60/252,649, filed on November 22, 2000, and U.S. application Serial
No. 60/252,732, filed November 22, 2000, under 35 U.S.C. § 119(e), the
disclosures of which are incorporated by reference herein.

Statement of Government Rights

The present invention was made with support from the United States 
Government (grant No. 96-35303-3198 from the USDA/NRI). The United 
States Government may have certain rights in the invention. 

Field of the Invention

The present invention relates to DNA molecules comprising fungal, e.g., 
Cochliobolus heterostrophus, genes from a peptide synthetase gene cluster, e.g., 
an iron reductase and/or a permease or major facilitator superfamily transporter, 
and uses thereof. 

Background of the Invention 

There are approximately 30 species included in the genus Cochliobolus,
nearly all of which are pathogens of wild grasses or cereals (Yoder et al., In: The
Mycota Vol. 5: Plant Relationships, Part A. Berlin: Springer-Verlag, Carroll,
eds., pp. 145-166 (1997)). Cochliobolus heterostrophus represents the most
widely distributed species in the genus and can be found in many tropical and
subtropical areas in the world. As a natural pathogen of corn, C. heterostrophus
causes a disease frequently called leaf spot of maize in the old literature
(Drechsler, J. Agr. Res., 31:701 (1925); Drechsler, Phytopathol., 24:953 (1934);
Yu, "Studies on Helminthosporium maydis" 36:327 (1952)). In the United
States, C. heterostrophus is usually found in the warmer southern states; thus, the
disease is commonly known as Southern Corn Leaf Blight (Hooker, Ann. Rev.
Phytopathol., 12:167 (1974)). For many years, Southern Corn Leaf Blight was
known only as an endemic disease and was not considered to be of major economic
importance in the United States. But in 1970 it suddenly broke into a severe
epidemic that destroyed 15% of the U.S. corn crop and caused losses estimated
at more than $1 billion. This serious damage made Southern Corn Leaf Blight
one of the most widely known crop diseases in the U.S.

Prior to the outbreak of the disease, only one race of C. heterostrophus
(race O) was known in the field. In late 1969 when the disease became an
epidemic, a new race of the fungus was identified from infected corn leaves
collected in severely diseased areas. It was soon designated as race T because of
its high virulence on T-cytoplasm corn and the ability to produce a phytotoxin
called T-toxin, which specifically affects T-corn. In contrast, race O does not
produce T-toxin and is mildly virulent on both T-cytoplasm and N-cytoplasm
(normal cytoplasm) corn (Hooker et al., Plant Dis. Reptr., 54:1109 (1970);
Scheifele, "Cytoplasmically Inherited Susceptibility to Diseases Related to
Cytoplasmically Controlled Pollen Sterility in Maize," 25:110 (1970); Smith et
al., Plant Dis. Rep., 54:819 (1970); Yoder et al., Phytopathology, 65:273 (1975);
Yoder, In: Biochemistry and Cytology of Plant Parasite Interaction, New York,
New York: Elsevier, Tomiyama, eds., pp. 16-24 (1976); Yoder, Ann. Rev.
Phytopathol., 18:103 (1980)). T-cytoplasm stands for Texas male sterile
cytoplasm, a unique cytoplasm with a trait for maternally inherited male sterility,
characterized by the failure to produce pollen (Levings, Science, 250:942
(1990)). T-cytoplasm corn was widely used for hybrid seed production and
breeding to avoid hand or mechanical emasculation in the 1950s and the 1960s.
It was the coexistence of large acreages of intensively planted T-cytoplasm corn
and the sudden appearance of race T of C. heterostrophus that resulted in the
epidemic of the disease in 1970. This discovery first opened the door to
understanding pathogenesis by C. heterostrophus.

Early genetic analysis suggested that both T-toxin production and high
virulence on T-cytoplasm corn are controlled by a single genetic locus defined as
Tox1 (Leach et al., Physiol. Plant Pathol., 21:327 (1982)). This was
demonstrated by crosses between race T and race O in which only parental
phenotypes segregated in a 1:1 ratio (Tox+:Tox-); all T-toxin producing progeny
are highly virulent on T-cytoplasm corn while all T-toxin nonproducing progeny
are weakly virulent (Yoder et al., 1975, supra; Leach et al., 1982, supra).
Further investigation by comparison of electrophoretic karyotypes and
chromosome-specific DNA hybridizations indicated that Tox1 is tightly linked to
a reciprocal translocation breakpoint and is associated with as much as a
megabase of DNA (mostly highly repeated and A+T-rich) that is missing in race
O (Bronson, Genome, 30:12 (1988); Tzeng et al., Genetics, 130:81 (1992);
Chang et al., Genome, 39:549 (1996)). Surprisingly, recent analysis of several
Tox- mutants revealed that Tox1 is not a single locus but rather two loci, each on
a different translocated chromosome (Yoder et al., In Host-Specific Toxin:
Biosynthesis, Receptor and Molecular Biology, Tottori, Japan: Faculty of
Agriculture, Tottori Univ., Kohmoto, eds., pp. 23-32 (1994); Turgeon et al., Can.
J. Bot., 73:S1071 (1995)). These two Tox1 loci have been designated Tox1A and
Tox1B (Yoder et al., 1997, supra). Two genes, PKS1 and DEC1, have been
cloned from the two loci, respectively; both are required for biosynthesis of
T-toxin and are found only in race T isolates of C. heterostrophus (Yang, "The
Molecular Genetics of T-Toxin Biosynthesis by Cochliobolus heterostrophus,"
Ph.D. Thesis, Cornell University (1995); Yang et al., Plant Cell, 8:2139 (1996);
Rose et al., 8th Int. Symp. Mol. Plant-Microbe Int., Knoxville, p. J-49 (1996)).

Genetic analysis also suggested that T-toxin is required by C.
heterostrophus for its high virulence on T-cytoplasm corn. This hypothesis was
first tested by the generation of induced T-toxin deficient mutants using different
mutagenesis procedures. All mutants with a tight Tox- phenotype cause disease
symptoms that are indistinguishable from those caused by race O when tested on
both T- and N-cytoplasm corn, suggesting that T-toxin is indeed a virulence
factor (Yang et al., 1992; Lu et al., Proc. Natl. Acad. Sci. USA, 91:12649 (1994);
Rose et al. (1996), supra). This conclusion was firmly supported by the
site-specific disruption of PKS1 or DEC1 in the wild type race T genome;
disruptants lost the ability to produce T-toxin and caused race O type symptoms
on both T-corn and N-corn (Yang et al., 1996, supra; Rose et al., 1996, supra).
These experiments have given a very clear resolution for the role of T-toxin in
pathogenesis. They also implied that pathogenesis by C. heterostrophus must
involve additional pathogenicity factors, because race O, which does not produce
T-toxin, and race T-derived Tox- mutants are effective pathogens on corn.

A number of fungal molecules have been identified as general
pathogenicity or virulence factors in several plant-pathogenic fungi (Yoder et al.,
J. Genet., 75:425 (1996)). These include potential penetration factors such as
melanin (Guillen et al., Fungal Genet. Newsl., 41:41 (1994)), cutinase (Oeser et
al., Mol. Plant-Microbe Int., 7:282 (1994)) and polygalacturonase and xylanase
(Lyngholm et al., Fungal Genet. Newsl., 42:46 (1995)) or possible mechanisms
involved in colonization such as phytotoxin detoxification (Schafer et al.,
Science, 246:247 (1989)) or components of signal transduction pathways.
Although C. heterostrophus is known to produce a nonhost specific toxin called
ophiobolin (or cochliobolin), a C25 sesterterpenoid compound, which is toxic to
many organisms, including plants, bacteria, fungi and nematodes, there is no
evidence that ophiobolins are involved in pathogenesis by C. heterostrophus or
other phytopathogenic fungi. No other pathogenesis-related toxins have been
isolated from C. heterostrophus so far, but studies on closely related
Cochliobolus species and other phytopathogenic fungi suggest that pathogenesis
by this group of fungi also involves peptide toxins.

Four peptide phytotoxins (victorin, HC-toxin, AM-toxin, and enniatins)
have been characterized as pathogenicity or virulence factors. They are all small
cyclic peptides (4-6 residues), containing unusual amino acids or hydroxy acids,
and they can be either host specific or non-host specific in terms of plant
toxicity. A number of peptide phytotoxins are believed to be synthesized
nonribosomally. Early in the 1960s, several biochemists working on the
bacterial peptide antibiotics gramicidin and tyrocidine found that these
polypeptides can be synthesized in RNAase-treated particle-free extracts of
Bacillus brevis that are known to produce the same antibiotics; adding
protein-synthesis inhibitors to the extracts does not affect this process. This indicated
the existence of a peptide biosynthetic system in which ribosomes and mRNAs
are not needed. Further studies revealed that in this system, peptides are
synthesized on a protein template and this template itself is a multifunctional
enzyme or a complex of several such enzymes, collectively called peptide
synthetases, catalyzing the biosynthetic process (Laland et al., Essays in
Biochemistry, 7:31 (1973); Lipmann, Adv. Microbiol. Physiol., 21:277 (1980)).

Peptide synthetases can catalyze biosynthesis of a variety of peptides. In
terms of bioactivity, they can be antibiotics, enzyme inhibitors, plant or animal
toxins and immunosuppressants (Stachelhaus et al., Journal of Biological
Chemistry, 270:6163 (1995)). In terms of chemical structure, they can be either
linear (e.g., ACV, the penicillin precursor, and gramicidin) or cyclic (most are).
The latter can be further classified into three subgroups: 1) the "standard" cyclic
peptides (e.g., gramicidin S, tyrocidine, HC-toxin and cyclosporin); 2) cyclic
lactones (e.g., destruxin); and 3) cyclic depsipeptides (e.g., beauvericin and
enniatin). Over 300 different carboxy compounds can be activated by peptide
synthetases.

Although the first peptide synthetase, gramicidin S synthetase, was
purified and used for the cell-free synthesis of the peptide early in the 1960s
(Tomino et al., Biochem., 6:2552 (1967)), the first bacterial peptide synthetase
gene, tycA, which encodes the tyrocidine synthetase 1 in B. brevis, was not
cloned until almost twenty years later (Marahiel et al., Mol. Gen. Genet.,
201:1986 (1985)). Since then, more than twenty peptide synthetase genes have
been reported for both bacteria and filamentous fungi, but only fourteen have
complete nucleotide sequences published. All are at least 3.3 kb, ranging
from 3.3 to 19.5 kb for bacterial genes and from 9.4 to 45.8 kb for fungal ones.
Interestingly, all fungal peptide synthetase genes reported lack introns, even the
cyclosporin A synthetase gene simA, which has a 45.8 kb open reading frame
(the largest genomic ORF so far recorded). Although biosynthesis of bacterial
peptides differs from that of fungal ones in terms of the number of
multifunctional enzymes involved, the genes encoding these enzymes are similar
to each other in both function and structure.

Comparison of nucleotide sequences reveals one or more highly
conserved regions at certain positions in each peptide synthetase gene. These
regions, formerly called "amino acid activating domains" (Stachelhaus et al.,
1995, supra) and now called "amino acid activating modules" (Marahiel, Chem.
Biol., 4:561 (1997)), consist of a set of domains (formerly called "modules")
believed to have specific functions such as recognition, activation and
thioesterification of individual constituent amino or hydroxy acids, and in some
cases methylation and racemization for modification of certain residues before
incorporation into the peptide chain (Stachelhaus et al., 1995, supra). The most
convincing evidence supporting this assignment is that in most cases, the number
of conserved functional units in each gene or gene cluster is equal to the number
of amino acids in the respective peptide. This one-for-one match is very clear
between three of four fungal peptides and their biosynthetic genes. The total
number of modules in three of four bacterial gene clusters also matches the
number of amino acids in the respective peptides.

Sequence alignment of amino acid-activating modules reveals strictly
conserved sequence motifs that contain active residues for module functions.
These motifs are called "core sequences" (Marahiel, FEBS Lett., 307:40 (1992)).
A minimal amino acid-activating module must contain six core sequences,
whose functions (except for core 1) have been proposed based on mutational
analysis of several peptide synthetases. Core sequences 1-5 are grouped into an
amino acid adenylation domain and core 6 is a thioester formation domain
(Figure 1A). All bacterial peptide synthetase genes contain "type I modules," the
minimal amino acid activating modules which were previously called "type I
domains" (Stachelhaus et al., 1995, supra). Two fungal genes, acvA and HTS1,
also have this modular structure. In addition to the type I module, two fungal
genes, esyn1 and simA, contain type II modules, in which an insertion (about 400
amino acids) is found between cores 5 and 6 of a normal type I module. This
region contains a motif (VLE/DXGXGXG; SEQ ID NO:1), highly conserved in
S-adenosyl-methionine (SAM)-dependent methyltransferases; hence, it is
referred to as an N-methylation domain (Figure 1A). Additional evidence for
methyltransferase activity of this module is that the number and position of type
II modules in esyn1 and simA exactly match those of the N-methylated amino acids in
the enniatin and cyclosporin sequences (Figure 1B).
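
As an illustration only, the motif VLE/DXGXGXG can be read as a simple pattern (V, L, then E or D, then alternating arbitrary residues and glycines) and searched for in a protein sequence. The following minimal sketch assumes that X stands for any single residue; the function name and the toy sequence are hypothetical and are not taken from any sequence disclosed herein.

```python
import re

# Assumption: "X" in VLE/DXGXGXG means any single residue, and "E/D" means
# either glutamate or aspartate at the third position.
SAM_MOTIF = re.compile(r"VL[ED].G.G.G")


def find_sam_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each motif hit.

    Hits are reported left to right and do not overlap.
    """
    return [(m.start() + 1, m.group()) for m in SAM_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence containing one VLE... and one VLD... variant.
toy_sequence = "MKTAVLEVGCGTGAALLPQRVLDIGSGSGKK"
print(find_sam_motifs(toy_sequence))
```

A hit of this kind is only suggestive; assigning an N-methylation domain would still depend on the surrounding sequence context described above.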

Although the modular structure described above is highly conserved
among most peptide synthetase genes, some variations have been found in the
latest cloned peptide synthetase gene safB, which is the first gene in the
saframycin Mx1 synthetase gene cluster (Pospiech et al., Microbiology,
141:1793 (1995)). safB contains two type I amino acid activating modules. One
module has all six highly conserved core sequences, but another, believed to
activate alanine (the first amino acid in the linear tetrapeptide precursor of
saframycin Mx1), lacks core 5 and has a weakly conserved core 1 (Pospiech et
al., Microbiology, 142:741 (1996)) (Figure 1A). This suggests that some of the
motifs in the amino acid adenylation domain are dispensable or not critical for
domain function. It also raises the possibility that other variations might be
found in yet unknown peptide synthetase genes.

Although C. heterostrophus has been a model eukaryotic plant pathogen
since the 1970s, most molecular genetic analyses conducted in this system have
focused on production of the polyketide T-toxin by race T isolates of the fungus.
Solid evidence now indicates that T-toxin is a host-specific virulence factor in
Southern Corn Leaf Blight (Yoder et al., J. Genet., 75:425 (1996); Yoder et al.,
1997). It is clear, however, that C. heterostrophus needs additional factors,
presumably general factors for pathogenesis to corn plants, since race O, which
does not produce T-toxin, can be an effective corn pathogen. Attempts to
identify additional general factors required by C. heterostrophus for pathogenesis
have been unsuccessful.

Thus, what is needed is the isolation and characterization of additional
fungal genes that control the biosynthesis of novel fungal molecules associated
with pathogenesis, i.e., genes which are potential targets for the design of
products that might interfere with the infection process, and vertebrate fungal
orthologs of fungal peptide synthetase genes.

Summary of the Invention 

The invention generally relates to an isolated nucleic acid molecule
(polynucleotide), e.g., DNA or RNA, comprising a nucleic acid segment which
encodes a gene product related to pathogenesis. In one embodiment of the
invention, fungal genes which are related to pathogenesis are identified. An
advantage of the present invention is that the genes described herein provide the
basis to identify a novel fungicidal or mycocidal mode of action which permits
rapid discovery of novel inhibitors of gene products that are useful as fungicides
or mycocides. In addition, the invention provides isolated genes or gene
products from fungi for assay development for inhibitory compounds with
fungicidal or mycocidal activity, as agents which inhibit the function or reduce or
suppress the activity of those gene products in fungi are likely to have
detrimental effects on fungi, and are good fungicide or mycocide candidates.
The present invention therefore also provides methods of using a polypeptide
encoded by one or more of the genes of the invention or a cell expressing such a
polypeptide to identify inhibitors of the polypeptide, which can then be used as
fungicides to suppress the growth of pathogenic fungi. Pathogenic fungi are
defined as those capable of colonizing a host and causing disease. Examples of
fungal pathogens include plant pathogens such as Septoria tritici, Ashbya gossypii,
Stagonospora nodorum, Botrytis cinerea, Fusarium graminearum, Magnaporthe
grisea, Cochliobolus heterostrophus, Colletotrichum, Ustilago maydis,
Erysiphe graminis, plant pathogenic oomycetes such as Pythium ultimum and
Phytophthora infestans, as well as dimorphic fungal pathogens including
Blastomyces, e.g., B. dermatitidis, Coccidioides, Histoplasma, e.g., H.
capsulatum, or Paracoccidioides, e.g., P. brasiliensis, Loboa, Malassezia,
Rhodotorula, Blastoschizomyces, Trichosporon, Saccharomyces, Cryptococcus
including Cryptococcus neoformans, as well as human pathogens such as
Candida albicans, and other pathogenic Candida, e.g., C. tropicalis, C.
parapsilosis and C. guilliermondii, Coccidioides immitis, and Aspergillus
fumigatus, Sporothrix schenckii, pathogenic members of the genera
Epidermophyton, Microsporum and Trichophyton, Cladosporium (Xylohypha)
trichoides, Cladosporium bantianum, Penicillium marneffei, Exophiala
(Wangiella) dermatitidis, Fonsecaea pedrosoi and Dactylaria gallopava
(Ochroconis gallopavum), and including mycogens. Preferred fungi for use with
the agent identified by the method of the invention are Ascomycota.

In one embodiment of the invention, the invention relates to an isolated
polynucleotide comprising a nucleic acid segment encoding an ortholog of a
plant fungal CPS1, e.g., SEQ ID NO:3 from Cochliobolus which is a CoA ligase,
or a nucleic acid segment encoding a gene product that modulates fungal iron
metabolism, uptake, absorption of inorganic or organic ferric salts, e.g., a fungal
iron reductase, permease or MFS transporter, e.g., a siderophore transporter,
which genes may be associated with CPS1 in a gene cluster. As described herein
below, a gene from Coccidioides immitis and Candida that is related to the CPS1
gene of Cochliobolus was identified, e.g., a nucleic acid sequence comprising an
open reading frame comprising SEQ ID NO:46 which encodes SEQ ID NO:47 or
the complement thereof. The CPS1 gene in Cochliobolus is present in a cluster
of closely linked open reading frames, a cluster which is associated with
virulence and/or pathogenicity, wherein CPS1 is representative of a novel class
of adenylation domain-containing enzymes related to but distinct from
nonribosomal peptide synthetases (NRPSs). Thus, at least one of the genes in the
cluster may control biosynthesis of a secondary metabolite (small molecule) that
is required for or associated with fungal virulence and/or pathogenesis.
Similarly, orthologs of the described Cochliobolus gene cluster, e.g., those in
Coccidioides or Candida, may encode gene products that are required for or
associated with fungal virulence. As also described hereinbelow, a Cochliobolus
iron reductase (SEQ ID NO:49 encoded by SEQ ID NO:48) and a permease
and/or MFS transport protein gene (SEQ ID NO:55 encoding SEQ ID NO:56)
were identified that are closely linked to a CPS1 peptide synthetase gene, e.g., a
DNA molecule comprising SEQ ID NO:2 (GenBank accession no. AF332878)
encoding SEQ ID NO:3 (GenBank accession no. AAG53991), which is part of a
gene cluster associated with virulence and/or pathogenicity.

Thus, at least one of the genes in the cluster may control biosynthesis of
at least one secondary metabolite or other small molecule that is required for or
associated with fungal growth, virulence and/or pathogenesis. The
fungal-produced siderophore may sequester iron from the environment or host to aid in
fungal growth. Pseudomonas aeruginosa produces pigments that are likely
associated with virulence, e.g., pyocyanin. A derivative of pyocyanin,
pyochelin, is a siderophore that is produced under low iron conditions to
sequester iron from the environment for growth of the pathogen. The
competition for iron may have a deleterious effect on the host. Similarly, the
Cochliobolus iron reductase or permease/transporter or other gene products
associated with iron metabolism may compete with the host for Fe and so
contribute to the pathogenicity of the fungus. Similarly, orthologs of the
described genes in the Cochliobolus gene cluster in other fungi which infect
plants or those that infect vertebrate animals may encode gene products that are
required for or associated with fungal virulence including iron metabolism genes,
e.g., genes associated with secretion of a toxin or siderophore.
Preferably, the nucleic acid segment is obtained or isolatable from a
fungal gene which encodes a polypeptide which is substantially similar, and
preferably has at least 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%,
79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, and even 90% or
more, e.g., 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, up to at least 99%,
amino acid sequence identity to, a polypeptide encoded by a nucleic acid
sequence comprising any one of SEQ ID NO:46, SEQ ID NO:48, SEQ ID
NO:55, or a fragment (portion) thereof which encodes a partial length
polypeptide having substantially the same activity as the full-length polypeptide.
Preferably, the activity of the partial length polypeptide is at least 50%, generally
at least 60%, ordinarily at least 70%, preferably at least 80%, more preferably at
least 90% and more preferably still at least 95% of the activity of the full-length
polypeptide. Preferred partial length polypeptides have substantially the same
activity as the corresponding full-length polypeptide.

Further provided is an isolated polynucleotide comprising a nucleic acid
segment which is substantially similar, and preferably has 70%, e.g., 71%, 72%,
73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, and even 90% or more, e.g., 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, up to at least 99%, nucleotide sequence identity to, a nucleic acid
sequence comprising an open reading frame comprising any one of SEQ ID
NO:46, SEQ ID NO:48, or SEQ ID NO:55.
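
For illustration, a percent identity figure of the kind recited above can be computed from two sequences that have already been aligned. The sketch below is not a method prescribed herein; it assumes that identity is counted over alignment columns (columns where both sequences carry a gap are ignored), which is only one of several conventions in use. The function names and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent identity (default 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments differing at one of ten positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
print(meets_identity_threshold("ACGTACGTAC", "ACGTTCGTAC"))
```

Because the result depends on how gaps are counted, any reported identity value should state the alignment method and denominator used.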

Another aspect of the present invention, as described below, relates to a
method for identifying inhibitors of the gene products encoded by the
polynucleotides of the invention, which involves contacting the gene product or
cell expressing the polynucleotide with agents that are potential inhibitor
compounds, and selecting compounds which decrease the activity of the gene
product and/or inhibit cell growth. In another embodiment, the invention relates
to a method of imparting disease resistance to a plant or other organism by
overexpressing the CPS1 ortholog of the invention in the plant or other
organism.

The nucleic acid molecules of the invention are preferably obtained or
isolatable from a gene from fungi that infect vertebrates, including but not
limited to mammals, e.g., livestock such as bovine, ovine, porcine, equine and
avians such as turkey and chickens and domestic pets including avians, feline
and canine, and humans, which genes are related to pathogenesis. For example,
preferred nucleic acid molecules of the invention are obtained or isolatable from
Ascomycota (ascomycetes), and the agents of the invention are useful to treat
infections due to Ascomycota, based on the discovery of CPS1, its
orthologs and related genes in the cluster, in various ascomycete human (and
plant) pathogens as disclosed herein. Within pathogenic Ascomycetes, the
following groups are of interest: Agyriales, Arthoniales, Ascosphaerales,
Caliciales, Calosphaeriales, Capnodiales, Chaetothyriales (black yeasts),
Cyttariales, Diaporthales, Dothideales, Elaphomycetales, Erysiphales (powdery
mildews), Eurotiales (green and blue mold), Gyalectales, Halosphaeriales,
Helotiales, Hypocreales, Laboulbeniales, Lecanorales, Lulworthiales,
Melanommatales, Meliolales, Microascales, Myriangiales, Neolectales,
Onygenales, Ophiostomatales, Ostropales, Patellariales, Pertusariales,
Pezizales, Phyllachorales, Pleosporales, Protomycetales, Pyrenulales,
Rhytismatales, Saccharomycetes, Schizosaccharomycetales, Sordariales,
Taphrinales, Teloschistales, Thelebolaceae, Umbilicariales, Xylariales,
anamorphic Ascomycota, unclassified Ascomycota, and Ascomycota incertae
sedis.

Regarding Ascomycetes animal pathogens, preferred are pathogenic
Onygenales, more particularly the anamorphic Onygenales, which includes
Coccidioides, and the Onygenaceae and its group Ajellomyces, which includes
Histoplasma such as Histoplasma capsulatum, and Blastomycoides such as
Blastomycoides dermatitidis. Also preferred are pathogenic Saccharomycetes,
more preferably Saccharomycetales, and even more preferably anamorphic
Saccharomycetales, which includes Candida species. Also preferred are
Chaetothyriales, more preferably Herpotrichiellaceae, even more preferably
anamorphic Herpotrichiellaceae, and even more preferably Exophiala, which
includes the human-pathogenic organisms Exophiala dermatitidis and Exophiala
jeanselmei. Also preferred are the Onygenales, more preferably
Arthrodermataceae, more preferably anamorphic Arthrodermataceae, and even
more preferably Trichophyton, which contains Trichophyton rubrum. Another
preferred group is Fungi incertae sedis, more preferably Pneumocystidaceae, and
even more preferably Pneumocystis, which includes the human pathogen
Pneumocystis carinii. Yet another preferred group is Eurotiales, more preferred
Trichocomaceae, even more preferred anamorphic Trichocomaceae, and yet
even more preferred is Aspergillus species, which contains Aspergillus
avenaceus and Aspergillus fumigatus. Another preferred group are those
pathogenic fungi in Pleosporales, more preferably Pleosporaceae, yet more
preferably anamorphic Pleosporaceae, and even more preferably Alternaria
species, which includes airborne Alternaria alternata. Also preferred is
Ascomycota incertae sedis, more preferably Mycosphaerellaceae, particularly the
anamorphic Mycosphaerellaceae, and more preferably the species
Cladosporium, which includes airborne human pathogens. Also preferred are
anamorphic Ascomycota, more preferably the species Helminthosporium.
Within Onygenales are preferably anamorphic Onygenales, and more preferably
the Paracoccidioides species, which includes Paracoccidioides brasiliensis.
Also preferred are Microascales, more preferably Microascaceae, and even more
preferably Pseudallescheria species, which includes Pseudallescheria boydii.
Also preferred are Ophiostomatales, more preferably Ophiostomataceae, yet
more preferably anamorphic Ophiostomataceae, and more preferably Sporothrix
species, including Sporothrix schenckii.

The term "substantially similar", when used herein with respect to a
polypeptide, means a polypeptide corresponding to a reference polypeptide,
wherein the polypeptide has substantially the same structure and function as the
reference polypeptide, e.g., where the only changes in amino acid sequences are
those which do not affect the polypeptide function. When used for a polypeptide
or an amino acid sequence, the percentage of identity between the substantially
similar and the reference polypeptide or amino acid sequence is at least 70%,
e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%,
84%, 85%, 86%, 87%, 88%, 89%, and even 90% or more, e.g., 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, up to at least 99%, wherein the reference
polypeptide comprises SEQ ID NO:47, SEQ ID NO:49 or SEQ ID NO:56. One
indication that two polypeptides are substantially similar to each other is that an
agent, e.g., an antibody, which specifically binds to one of the polypeptides,
specifically binds to the other.

In its broadest sense, the term "substantially similar", when used herein 
with respect to a nucleotide sequence or nucleic acid segment, means a 
nucleotide sequence or segment corresponding to a reference nucleotide 
sequence or nucleic acid segment, wherein the corresponding sequence encodes a 

30 polypeptide having substantially the same structure and function as the 
polypeptide encoded by the reference nucleotide sequence or nucleic acid 
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segment. The term "substantially similar" is specifically intended to include 
nucleotide sequences wherein the sequence has been modified to optimize 
expression in particular cells. The percentage of identity between the 
substantially similar nucleotide sequence and the reference nucleotide sequence 
is at least 70%, e.g., 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%,
81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, and even 90% or more, e.g.,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, up to at least 99%, preferably
wherein the reference sequence comprises SEQ ID NO:46, SEQ ID NO:48, SEQ
ID NO:55 or the complement thereof. Sequence comparisons may be carried out
using a Smith-Waterman sequence alignment algorithm (see, e.g., Waterman,
Introduction to Computational Biology: Maps, sequences and genomes,
Chapman & Hall, London (1995) or
http://www-hto.usc.edu/software/seqaln/index.html). The localS program,
version 1.16, is preferably used with the following parameters: match: 1,
mismatch penalty: 0.33, open-gap penalty: 2, extended-gap penalty: 2. Further,
a nucleotide sequence that is "substantially similar" to a reference
nucleotide sequence hybridizes to the reference nucleotide sequence under
moderate, stringent, or very stringent hybridization conditions, e.g., in 7%
sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in
2X SSC, 0.1% SDS at 50°C, more desirably in 7% sodium dodecyl sulfate (SDS),
0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC, 0.1% SDS at 50°C, more
desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at
50°C with washing in 0.5X SSC, 0.1% SDS at 50°C, preferably in 7% sodium
dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X
SSC, 0.1% SDS at 50°C, more preferably in 7% sodium dodecyl sulfate (SDS),
0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 65°C.
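
By way of illustration only, the Smith-Waterman scoring quoted above can be
sketched in Python as follows. This is a minimal sketch and not the localS
program itself: the function name is hypothetical, the match score is read as
1, and the gap convention (the first gapped position costs the open-gap
penalty, each further position costs the extended-gap penalty) is an
assumption, since the gap convention is not defined here.

```python
def smith_waterman_score(a, b, match=1.0, mismatch=0.33,
                         gap_open=2.0, gap_extend=2.0):
    """Return the best local alignment score of sequences a and b.

    Smith-Waterman recurrence with affine gaps (Gotoh formulation).
    Assumed convention: a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols      # best scores for the previous row
    f_prev = [neg_inf] * cols  # previous row, alignment ending in a gap in b
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # current row, alignment ending in a gap in a
        for j in range(1, cols):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            step = match if a[i - 1] == b[j - 1] else -mismatch
            # Local alignment: a score never drops below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + step, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGTACGT", "ACGAACGT"))
```

The sketch returns only the optimal local score; recovering the aligned
residues, from which a percentage of identity would be counted, requires an
additional traceback step that is omitted here.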

Thus, the invention also includes recombinant nucleic acid molecules
which have been modified so as to comprise codons other than those present in
the unmodified sequence or have been modified by shuffling. The recombinant
nucleic acid molecules of the invention include those in which the modified
codons specify the same amino acid as the codons in the unmodified sequence,
as well as those that specify different amino acids, i.e., they encode a
variant polypeptide having one or more amino acid substitutions relative to
the polypeptide encoded by the unmodified sequence.

The invention further includes a nucleotide sequence which is
complementary to one (hereinafter "test" sequence) which hybridizes under
stringent conditions with the nucleic acid molecules of the invention, as well
as RNA which is encoded by the nucleic acid molecules of the invention. When
the hybridization is performed under stringent conditions, either the test or
nucleic acid molecule of the invention is preferably supported, e.g., on a
membrane or DNA chip. Thus, either a denatured test or nucleic acid molecule
of the invention is preferably first bound to a support and hybridization is
effected for a specified period of time at a temperature of, e.g., between 55
and 70°C, in double strength citrate buffered saline (SC) containing 0.1% SDS,
followed by rinsing of the support at the same temperature but with a buffer
having a reduced SC concentration. Depending upon the degree of stringency
required, such reduced concentration buffers are typically single strength SC
containing 0.1% SDS, half strength SC containing 0.1% SDS and one-tenth
strength SC containing 0.1% SDS.
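
The effect of the successive rinses on salt concentration can be illustrated
with a short calculation. The sketch below assumes that the citrate buffered
saline referred to above is conventional saline-sodium citrate, in which the
single strength buffer is 0.15 M NaCl and 0.015 M trisodium citrate; that
composition is not recited in this passage and is an assumption of the
example, as are the function and constant names.

```python
# Assumed composition of single strength buffer (conventional 1X SSC).
NACL_MOLAR_1X = 0.150
TRISODIUM_CITRATE_MOLAR_1X = 0.015


def sodium_molar(strength):
    """Approximate Na+ concentration (mol/L) of a buffer at the given strength.

    Each NaCl contributes one Na+ and each trisodium citrate contributes three.
    """
    return strength * (NACL_MOLAR_1X + 3 * TRISODIUM_CITRATE_MOLAR_1X)


if __name__ == "__main__":
    # Double, single, half and one-tenth strength rinses, as described above.
    for strength in (2.0, 1.0, 0.5, 0.1):
        print(f"{strength:>4}X: {sodium_molar(strength):.4f} M Na+")
```

Lower sodium concentration destabilizes imperfectly matched duplexes, which is
why the more dilute rinses correspond to higher stringency.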

Hence, the isolated nucleic acid molecules of the invention include
orthologs of SEQ ID NO:46, SEQ ID NO:48 and SEQ ID NO:55, which includes
orthologs of the polypeptides encoded therein. An ortholog is a gene from a
different species that encodes a product having the same function as the
product encoded by a gene from a reference organism. The encoded ortholog
products likely have at least 68 to 70% (substantial) sequence identity to
each other. Hence, in one embodiment, the invention includes an isolated
polynucleotide comprising a nucleic acid segment encoding a polypeptide having
at least 68 to 70% identity to a polypeptide encoded by SEQ ID NO:46, SEQ ID
NO:48 or SEQ ID NO:55. Databases such as GenBank, which can be accessed at
http://www.ncbi.nlm.nih.gov/, may be employed to identify sequences related to
those sequences. Alternatively, recombinant DNA techniques such as
hybridization or PCR may be employed to identify sequences related to the
sequences. Preferred orthologs include those from dimorphic fungal pathogens
including Blastomyces, e.g., B. dermatitidis, Coccidioides, Histoplasma, e.g.,
H. capsulatum, or Paracoccidioides, e.g., P. brasiliensis, Loboa, Malassezia,
Rhodotorula, Blastoschizomyces, Trichosporon, Saccharomyces, Cryptococcus
including Cryptococcus neoformans, as well as human pathogens such as
Candida albicans, and other pathogenic Candida, e.g., C. tropicalis, C.
parapsilosis and C. guilliermondii, Coccidioides immitis, and Aspergillus
fumigatus, Sporothrix schenckii, pathogenic members of the genera
Epidermophyton, Microsporum and Trichophyton, Cladosporium (Xylohypha)
trichoides, Cladosporium bantianum, Penicillium marneffei, Exophiala
(Wangiella) dermatitidis, Fonsecaea pedrosoi and Dactylaria gallopava
(Ochroconis gallopavum), as well as other mycogens.

The invention also provides anti-sense nucleic acid molecules 
corresponding to the sequences described herein. Also provided are expression 
cassettes, e.g., recombinant vectors, and host cells, comprising the nucleic acid
molecule of the invention in which the nucleic acid segment is in either sense or
antisense orientation. Also provided is a microarray, comprising one or more of 
the nucleic acid molecules of the invention or a portion thereof. 

Owing to the dramatically increased incidence of life-threatening
opportunistic fungal infections, it is now clear that diseases of fungal
infection are of major importance. The rise in cases has been particularly
apparent in transplant recipients and others who are immunocompromised,
especially AIDS patients. Besides more serious infections associated with
these vulnerable groups, superficial infections such as ringworm and thrush
have also become more prevalent. Despite recognizing the importance of fungi
as a cause of disease in man and animals, many of the more serious fungal
infections remain difficult to diagnose and treat. Thus, there is a continuing
need to identify agents to treat fungal infections of vertebrates, including
immunocompromised vertebrates, and complications thereof, e.g., pneumonia,
flulike illness, erythema nodosum, erythema marginatum, arthritis, multiple
thin-walled chronic cavities, miliary disease, bone and joint infection, skin
disease, soft tissue abscesses, meningitis, oropharyngitis, oesophagitis,
vaginitis, onychomycosis, endophthalmitis, paronychia, and inflammation of the
urinary tract, kidney, liver, brain, gastrointestinal tract, and lung.

Thus, another aspect of the present invention relates to a method for
identifying inhibitors of the fungal vertebrate CPS1 ortholog, or fungal iron
reductase or permease/MFS transporter of the invention. For example, genes
encoding products that are associated with virulence, and agents that bind to
or otherwise alter or modulate the activity of that gene product, preferably
agents that inactivate or decrease (reduce or inhibit) the activity of the
gene product, can be identified. The method comprises contacting the gene
product(s) or cells which express the gene product(s) with an agent and then
determining or detecting whether the agent binds to, or decreases the activity
of, the gene product(s). Such an agent modulates or alters a phenotype of the
gene product or cell, e.g., pathogenicity of a cell which expresses the gene
product. Modulation or alteration encompasses an increase as well as a
decrease in an activity; preferably, the modification or alteration in the
activity of the gene product or cell having the gene product contacted with
the agent is at least 10%, or at least 50%, relative to the activity in an
untreated control. In particular, the methods are useful to identify agents
that inhibit, reduce or suppress the activity of the polypeptide, e.g., by at
least 10%, preferably at least 50%, relative to the activity in an untreated
control. Thus, the invention also provides agents identified by the methods of
the invention. Preferred agents bind to, more preferably inhibit, the activity
of a polypeptide of the invention, e.g., one encoded by a dimorphic fungal
pathogen such as one from Blastomyces, Coccidioides, Histoplasma or
Paracoccidioides, and includes pathogenic Candida, e.g., C. albicans, C.
tropicalis, C. parapsilosis and C. guilliermondii. The methods may employ
screening agents on wild type fungi and/or recombinant fungi, e.g., fungi
which overexpress the polypeptide of interest or do not express that
polypeptide, e.g., as a result of expression of antisense sequences or a gene
knock out. If the agent is one encoded by DNA, the expression of that DNA in
an organism susceptible to the pathogen, e.g., a plant, may confer on the
organism tolerance or resistance to the pathogen, preferably by inhibiting or
preventing pathogen infection. Methods of the invention may include stably
transforming a susceptible organism or cell with one or more sequences which
confer tolerance or resistance operably linked to a promoter capable of
driving expression of that nucleotide in the cells of the organism.
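
The comparison to an untreated control described above amounts to a simple
percentage calculation, sketched below. The function names, the labels
returned, and the sample activity values are hypothetical and serve only to
show how the 10% and 50% thresholds would be applied; the assay that produces
the activity measurements is not modeled.

```python
def percent_change(treated, control):
    """Percent change in activity relative to the untreated control."""
    if control == 0:
        raise ValueError("control activity must be non-zero")
    return (treated - control) * 100.0 / control


def classify_agent(treated, control, threshold=10.0):
    """Label an agent by the direction and size of its effect on activity.

    threshold is the minimum percent change (e.g. 10 or 50) that counts
    as modulation; smaller changes are reported as "no effect".
    """
    change = percent_change(treated, control)
    if change <= -threshold:
        return "inhibitor"
    if change >= threshold:
        return "activator"
    return "no effect"


if __name__ == "__main__":
    # Hypothetical activity readings in arbitrary units.
    print(classify_agent(45.0, 100.0))                   # 55% decrease
    print(classify_agent(45.0, 100.0, threshold=50.0))   # passes the 50% cut
    print(classify_agent(95.0, 100.0))                   # 5% decrease
```
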

Other uses for the nucleic acid molecules or polypeptides of the
invention include the use of the polypeptide to raise either polyclonal
antibodies or monoclonal antibodies, e.g., antibodies specific for the
polypeptide, to detect antibodies in the serum of a vertebrate, or primers or
probes specific for the nucleic acid molecules, which can be employed in
diagnostic assays for the presence of the pathogen or for therapeutic
purposes, and host cells comprising the nucleic acid molecules, e.g., in
antisense orientation, or having a deletion in at least a portion of at least
one of the genes corresponding to the nucleic acid molecules of the invention.
Also, given that the gene may encode a peptide synthetase (Watanabe et al.,
Chem. Biol., 3, 463 (1996)), the gene product may be useful in therapy, e.g.,
as an anti-cancer agent, an antibiotic, or as an immunosuppressant.

The agents identified by the methods of the invention may also be
subjected to further assays to determine whether the agent is substantially
non-toxic to a plant or vertebrate organism to be treated as well as the dose
to be administered to the vertebrate organism. For example, for Coccidioides,
a murine model may be employed (see, Kirkland et al., Infect. Immun., 40:912
(1983)). This model may also be used for screening for an agent of the
invention. Further, the agents identified by the methods of the invention,
e.g., those which are non-toxic to a plant or vertebrate to be treated, are
useful in methods of preventing or treating a disease or disorder associated
with fungal infection, including superficial, subcutaneous or systemic
infections. The method comprises administering to a vertebrate or plant in
need of such treatment, e.g., a vertebrate that is immunocompromised, an
amount of an agent of the invention effective to inhibit or prevent fungal or
mycogen infection or growth. For example, humans and non-human animals
including livestock and domestic pets may be treated with the agents of the
invention, e.g., livestock such as bovine, ovine, porcine, equine and avians
such as turkey and chicken and domestic pets including avians, felines and
canines. Preferably, the agents are administered topically to a mammal such as
a human. Preferred plants include cereals, for example, corn, alfalfa,
sunflower, rice, Brassica, canola, soybean, barley, soybean, sugarbeet,
cotton, safflower, peanut, sorghum, wheat, millet, and tobacco.

Moreover, the agents of the invention may be used in conjunction with
other therapeutic agents, e.g., fungicides, mycosides, and vaccines, including
amphotericin B and azoles. In addition, the agents may be employed to treat
sources of fungal contamination, such as the soil or surface areas or materials on
which fungi can survive and/or proliferate. Thus, the agents may be contacted
with soil or other surfaces that come in contact with vertebrates. Although this
contacting may not eliminate the fungus, it may reduce the risk of airborne
dissemination of the fungus or its spores.

Also provided is a computer readable medium having stored thereon a
nucleic acid sequence that is substantially similar to any one of SEQ ID NO:46,
SEQ ID NO:48, SEQ ID NO:55 or the complement thereof, and a computer
system comprising a processor and data storage device wherein said data storage
device has stored thereon a nucleic acid sequence that is substantially similar to
any one of SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:55 or the complement
thereof. Preferably, the computer system comprises an identifier which identifies
features in said sequence. Further provided is a database comprising at least one
nucleotide sequence in computer readable form wherein said nucleotide
sequence is substantially similar to any one of SEQ ID NO:46, SEQ ID NO:48,
SEQ ID NO:55, or the complement thereof. The database, for example, carries
out functions comprising determining homology, aligning sequences, adjusting
sequence alignments, assembling sequences having overlapping sequence,
predicting gene sequence, predicting intron borders, identifying motifs,
identifying domains, identifying untranslated regulatory sequences, identifying
putative sequencing errors, carrying out functional genomics analyses, or
carrying out shuffling of nucleotide sequences.

The invention also provides a method for generating nucleotide
sequences encoding polypeptides having at least one region of homology to SEQ
ID NO:46, SEQ ID NO:48, SEQ ID NO:55, or the complement thereof. The
method comprises shuffling an unmodified nucleotide sequence which is
identical or substantially identical to SEQ ID NO:46, SEQ ID NO:48, SEQ ID
NO:55, or the complement thereof. The resulting shuffled nucleotide sequence
is expressed and a gene product encoded thereby is selected for altered activity as
compared to the activity in a polypeptide encoded by SEQ ID NO:46, SEQ ID
NO:48, or SEQ ID NO:55. A DNA molecule comprising a shuffled nucleotide
sequence obtainable or produced by the method is also provided. In one
embodiment, the shuffled DNA molecule encodes a polypeptide having
enhanced tolerance to an inhibitor of the polypeptide encoded by SEQ ID NO:46,
SEQ ID NO:48, or SEQ ID NO:55. The shuffled DNA molecule may be
operably linked to a promoter to form a chimeric molecule which is introduced
to a host cell, e.g., a plant cell.



Brief Description of the Figures 

Figure 1 provides the structure of amino-acid activating modules
identified in peptide synthetase genes (adapted from Stachelhaus and Marahiel, J.
Biol. Chem., 270, 6163, 1995; Stachelhaus and Marahiel, FEMS Microbiol.
Lett., 125, 3, 1995; Pospiech 1995, supra; Marahiel, 1997, supra). Figure 1A
shows the domain arrangements in two types of modules. Structural variations
in the first module (safB1) of the gene safB are also indicated below type I.
Figure 1B shows the correlation between module types and the nature of residues
in two fungal peptides. Open box: type I module; filled box: type II module.
Each peptide sequence is given below.

Figure 2 is a restriction map of the cloned sequences surrounding the
tagged site. An 11.3 kb genomic region (thick line) was cloned and completely
sequenced. The original REMI insertion point in the mutant R.C4.2696 is
indicated by a vertical arrow. The asterisks indicate two targeted integration
sites in the wild type genome. Two open reading frames (in opposite directions),
ORF1 (CPS1, 5.4 kb) and ORF2 (TES1, 1.1 kb), are indicated by open boxes
below the map (the positions of putative introns are indicated by vertical bars).
Locations of seven overlapping plasmid clones used for sequencing are indicated
by thin lines on the top of the map (filled triangles represent the vector sequence
in each clone). The sequencing strategy is indicated by an arrow above each
clone line.

Figures 3A-C are schematic representations which show the
characterization of the modular structure of CPS1. Peptide synthetase and
thioesterase are indicated by open boxes; shaded boxes inside indicate functional
domains and modules; vertical bars in the shaded boxes indicate highly
conserved core sequences. Figure 3A illustrates the general structure of bacterial
and fungal peptide synthetases (adapted from Marahiel, 1997, supra). A peptide
synthetase gene cluster is shown on the top. There can be one or more amino
acid activating modules (cyclosporine synthetase has 11) in each protein; some
peptide synthetases have thioesterase domains (TE), which can be either
integrated into modules or encoded by a separate gene. Each synthetase can have
type I, type II or both modules. A type I (minimal) module is enlarged to show
organization of core sequences and domains. Some peptide synthetases also
have condensation or epimerization domains. Figure 3B illustrates the
organization of saframycin Mx1 synthetase containing 4 amino acid activating
modules (Pospiech et al., 1996, supra). SafB1 from the first module is enlarged.
Core sequences 1 and 5 in safB1 are weakly conserved (indicated by dashed
vertical bars). The remaining domains are typical of type I as shown in Figure
3A. SafC is a putative O-methyltransferase. Figure 3C illustrates the
organization of CPS1. Sequence analysis revealed two amino acid activating
modules (CPS1A and CPS1B), both of which have high similarity to safB1
except that core 2 is weakly conserved. A thioesterase domain is found at the
C-terminal region of CPS1B. Three vertical arrows indicate the positions of
targeted gene disruptions in the wild type genome that yielded the mutant
phenotype. TES1 is a thioesterase encoded by a separate gene (TES1).

Figures 4A-C depict DNA gel blots showing DNA-DNA hybridization of
ChCPS1 to other fungal genera and species. (A) Cochliobolus species (1-17): C.
heterostrophus race T, race O; C. carbonum race 1, race 2; C. victoriae isolates
FI3, HvW; C. bicolor, C. dactyloctenii, C. chloridis, C. homomorphus, C.
intermedius, C. melinidis, C. melinidis, C. peregianensis, C. perotidis, C.
ravenelii and C. sativus. (B) Other Ascomycete genera (1-14): C. carbonum
race 1 (control), Setosphaeria rostrata, Stemphylium spp., Pyrenophora
tritici-repentis, Bipolaris sacchari, Alternaria spp., A. solani, Nectria
haematococca, Fusarium oxysporum, Glomerella spp., Magnaporthe grisea, F.
moniliforme, F. moniliforme (repeat) and A. solani (repeat). (C) Candida
albicans compared to C. heterostrophus and closely related species (1-7): C.
heterostrophus race T, Bipolaris sacchari, Setosphaeria rostrata, Stemphylium
spp., Pyrenophora tritici-repentis, Alternaria spp. and Candida albicans
(arrowhead). Genomic DNAs were digested with HindIII (A, lanes 1-17; B, lanes
1-11; C, lanes 1-7), XhoI (B, lanes 12 and 14) or BglII (B, lane 13) and probed
with the 3.2 kb fragment of CPS1 at high stringency. Weak signals in lanes 3
and 17 (panel A) are due to insufficient DNA loading (confirmed by a repeat
experiment).

Figures 5A-B show similarity of the cloned CPS1 homologs to C.
heterostrophus CPS1. (A) Structural comparison of the four CPS1 homologs to
ChCPS1 (As = Alternaria solani; Pt = Pyrenophora teres; Fg = Fusarium
graminearum; Ci = Coccidioides immitis). ORFs are indicated by the open
boxes; shaded boxes inside indicate functional domains; vertical bars indicate
conserved motif sequences found in nonribosomal peptide synthetases (NRPS)
as defined by Stachelhaus and Marahiel (Stachelhaus and Marahiel, 1995, supra;
Marahiel, 1997, supra) (dashed bars indicate weak conservation). The black
bulbs indicate the position of putative introns. Cores 1-5: adenylation; core 6:
thiolation; TE: thioesterase. The distance between core sequences is not drawn
to exact scale. The name of each protein is on the left of the ORF box and the
number of amino acids on the right. The unidentified regions of AsCPS1, PtCPS1
and CiCPS1 are indicated by dash-lined boxes. The similarity to ChCPS1 (in the
overlapping region only) is given in the parentheses under the protein names in
the order: nucleotide identity/amino acid identity/amino acid similarity. The
positions of the ChCPS1 amino acids 220 and 1040 (corresponding to the first
and the last amino acid of CiCPS1) are indicated by open arrows; the positions
511 and 1269 (to the first and the last amino acids of AsCPS1 and PtCPS1) are
indicated by filled triangles. (B) Amino acid alignment of the four CPS1
homologs to ChCPS1. 530 amino acids aligned to the amino acids 511-1040 of
ChCPS1 (SEQ ID NO:186) are shown (SEQ ID NOs:51-54). The identical
residues are in uppercase and the similar residues in lowercase. Consensus of
sequences similar to the typical NRPS signature motifs is underlined. The
putative cyclization domain motif "DXXXXD/EXXS/A" (SEQ ID NO:60) is
underlined.

Figure 6 shows the results of a BLAST search using FgCPS1 (SEQ ID
NO:41) as the query sequence.

Figure 7A shows the results of a BLAST search using CiCPS1 (SEQ ID
NO:47) as the query sequence.

Figure 7B shows an alignment of amino acid sequence of FgCPS1 (SEQ
ID NO:41), AsCPS1 (SEQ ID NO:43), PtCPS1 (SEQ ID NO:45), CiCPS1 (SEQ
ID NO:47), and ChCPS1 (SEQ ID NO:3).

Figures 8A-C show the sequencing strategy (A), restriction map (B), and
genome organization (C) for the ChCPS1 gene cluster. SEQ ID NO:59
represents the sequence of genes clustered near ChCPS1. SEQ ID NO:187 and
188 represent the DNA corresponding to and amino acid sequence encoded by
ORF 16, respectively. SEQ ID NO:189 and 190 represent the DNA
corresponding to and amino acid sequence corresponding to ORF 10,
respectively. SEQ ID NO:191 and 192 represent the DNA corresponding to and
amino acid sequence encoded by ORF 11, respectively. SEQ ID NO:193 and
194 represent the DNA corresponding to and amino acid sequence encoded by
ORF 12, respectively. SEQ ID NO:195 and 196 represent the DNA
corresponding to and amino acid sequence encoded by ORF 13, respectively.
SEQ ID NO:197 and 198 represent the DNA corresponding to and amino acid
sequence encoded by ORF 14, respectively. SEQ ID NO:199 and 200 represent
the DNA corresponding to and amino acid sequence encoded by ORF 3,
respectively. SEQ ID NO:201 and 202 represent the DNA corresponding to and
amino acid sequence encoded by ORF 5, respectively. SEQ ID NO:203 and 204
represent the DNA corresponding to and amino acid sequence encoded by ORF
6, respectively. SEQ ID NO:205 and 206 represent the DNA corresponding to
and amino acid sequence encoded by ORF 7, respectively. SEQ ID NO:207 and
208 represent the DNA corresponding to and amino acid sequence encoded by
ORF 8, respectively. SEQ ID NO:209 and 210 represent the DNA
corresponding to and amino acid sequence encoded by ORF 9, respectively.

Figure 9A shows the results of a BLAST search using SEQ ID NO:49 (an
iron reductase encoded by SEQ ID NO:48) as the query sequence.

Figure 9B shows an alignment of amino acid sequence of a Cochliobolus
iron reductase (SEQ ID NO:49) and a S. cerevisiae reductase (SEQ ID NO:184).

Figure 9C illustrates a DNA comprising SEQ ID NO:48 (SEQ ID
NO:211).

Figure 9D illustrates the amino acid sequence (SEQ ID NO:212) encoded
by SEQ ID NO:211.

Figure 10 shows the results of a BLAST search using the polypeptide
(SEQ ID NO:56) encoded by SEQ ID NO:55 (a Cochliobolus permease and/or
MFS transporter) as the query sequence.

Detailed Description of the Invention

Definitions 

The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides
and polymers thereof in either single- or double-stranded form, composed of
monomers (nucleotides) containing a sugar, phosphate and a base which is either
a purine or pyrimidine. Unless specifically limited, the term encompasses
nucleic acids containing known analogs of natural nucleotides which have
similar binding properties as the reference nucleic acid and are metabolized in a
manner similar to naturally occurring nucleotides. Unless otherwise indicated, a
particular nucleic acid sequence also implicitly encompasses conservatively
modified variants thereof (e.g., degenerate codon substitutions) and
complementary sequences as well as the sequence explicitly indicated.
Specifically, degenerate codon substitutions may be achieved by generating
sequences in which the third position of one or more selected (or all) codons is
substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucl.
Acids Res., 19:508 (1991); Ohtsuka et al., JBC, 260:2605 (1985); Rossolini et
al., Mol. Cell. Probes, 8:91 (1994)). Although nucleotides are usually joined by
phosphodiester linkages, polymeric nucleotides joined by peptide linkages
(peptide nucleic acids) are also included (Nielsen and Egholm, Peptide
Nucleic Acids: Protocols and Applications, Horizon Scientific Press,
Wymondham, Norfolk UK, 1999). A "nucleic acid fragment" is a fraction of a
given nucleic acid molecule. Deoxyribonucleic acid (DNA) in the majority of
organisms is the genetic material while ribonucleic acid (RNA) is involved in the
transfer of information contained within DNA into proteins. The term
"nucleotide sequence" refers to a polymer of DNA or RNA which can be single-
or double-stranded, optionally containing synthetic, non-natural or altered
nucleotide bases capable of incorporation into DNA or RNA polymers. The
terms "nucleic acid", "nucleic acid molecule", "nucleic acid fragment" or
"nucleic acid sequence or segment" may also be used interchangeably with gene,
cDNA, DNA and RNA encoded by a gene.
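
The degenerate codon substitution described above, in which the third position
of selected (or all) codons is replaced, can be sketched as a simple string
operation. The function name and the example sequence are hypothetical, and
the use of the letter N to denote a mixed-base position and the letter I to
denote deoxyinosine is a notational assumption of this sketch.

```python
def degenerate_third_positions(coding_seq, codon_indices=None, symbol="N"):
    """Replace the third base of selected codons with a placeholder symbol.

    coding_seq    -- in-frame coding sequence whose length is a multiple of 3
    codon_indices -- zero-based codon positions to modify; None means all
    symbol        -- "N" for a mixed-base position, "I" for deoxyinosine
    """
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    selected = range(len(codons)) if codon_indices is None else codon_indices
    for k in selected:
        codons[k] = codons[k][:2] + symbol
    return "".join(codons)


if __name__ == "__main__":
    example = "ATGGCTAAA"  # hypothetical three-codon sequence
    print(degenerate_third_positions(example))              # all codons
    print(degenerate_third_positions(example, [1]))         # second codon only
    print(degenerate_third_positions(example, symbol="I"))  # deoxyinosine
```
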

The invention encompasses isolated or substantially purified nucleic acid
or protein compositions. In the context of the present invention, an "isolated" or
"purified" DNA molecule or an "isolated" or "purified" polypeptide is a DNA
molecule or polypeptide that, by the hand of man, exists apart from its native
environment and is therefore not a product of nature. An isolated DNA molecule
or polypeptide may exist in a purified form or may exist in a non-native
environment such as, for example, a transgenic host cell. For example, an
"isolated" or "purified" nucleic acid molecule or protein, or biologically active
portion thereof, is substantially free of other cellular material, or culture medium
when produced by recombinant techniques, or substantially free of chemical
precursors or other chemicals when chemically synthesized. In one embodiment,
an "isolated" nucleic acid is free of sequences that naturally flank the nucleic
acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the
genomic DNA of the organism from which the nucleic acid is derived. For
example, in various embodiments, the isolated nucleic acid molecule can contain
less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide
sequences that naturally flank the nucleic acid molecule in genomic DNA of the
cell from which the nucleic acid is derived. A protein that is substantially free of
cellular material includes preparations of protein or polypeptide having less than
about 30%, 20%, 10%, or 5% (by dry weight) of contaminating protein. When the
protein of the invention, or biologically active portion thereof, is recombinantly
produced, preferably culture medium represents less than about 30%, 20%, 10%,
or 5% (by dry weight) of chemical precursors or non-protein-of-interest
chemicals. Fragments and variants of the disclosed nucleotide sequences and
proteins or partial-length proteins encoded thereby are also encompassed by the
present invention.

By "fragment" or "portion" is meant a full length or less than full length
of the nucleic acid sequence encoding, or the amino acid sequence of, a
polypeptide or protein. Alternatively, fragments or portions of a nucleotide
sequence that are useful as hybridization probes generally do not encode
fragment proteins retaining biological activity. Thus, fragments or portions of a
nucleotide sequence may range from at least about 6 nucleotides, about 9, about
12 nucleotides, about 20 nucleotides, about 50 nucleotides, about 100
nucleotides or more. By "portion" or "fragment", as it relates to a nucleic acid
molecule, sequence or segment of the invention, when it is linked to other
sequences for expression, is meant a sequence having at least 80 nucleotides,
more preferably at least 150 nucleotides, and still more preferably at least 400
nucleotides. If not employed for expressing, a "portion" or "fragment" means at
least 6, about 9, preferably 12, more preferably 15, even more preferably at least
20, consecutive nucleotides, e.g., probes and primers (oligonucleotides),
corresponding to the nucleotide sequence of the nucleic acid molecules of the
invention.

By "resistant" is meant an organism, e.g., a plant or animal, that exhibits
substantially no phenotypic changes as a consequence of infection with a
pathogen. By "tolerant" is meant an organism which, although it may exhibit
some phenotypic changes as a consequence of infection, does not have a
decreased reproductive capacity or substantially altered metabolism.

The term "gene" is used broadly to refer to any segment of nucleic acid
associated with a biological function. Thus, genes include coding sequences
and/or the regulatory sequences required for their expression. For example, gene
refers to a nucleic acid fragment that expresses mRNA, functional RNA, or
specific protein, including regulatory sequences. Genes also include
nonexpressed DNA segments that, for example, form recognition sequences for
other proteins. Genes can be obtained from a variety of sources, including
cloning from a source of interest or synthesizing from known or predicted
sequence information, and may include sequences designed to have desired
parameters.

"Naturally occurring" is used to describe an object that can be found in 

nature as distinct from being artificially produced by man. For example, a
protein or nucleotide sequence present in an organism (including a virus), which
can be isolated from a source in nature and which has not been intentionally 
modified by man in the laboratory, is naturally occurring. 

A "marker gene" encodes a selectable or screenable trait. 

A "selectable marker" is a gene whose expression in a cell gives the cell a
selective advantage. The selective advantage possessed by the cells transformed
with the selectable marker gene may be due to their ability to grow in the
presence of a negative selective agent, such as an antibiotic or a herbicide,
compared to the growth of non-transformed cells. The selective advantage
possessed by the transformed cells, compared to non-transformed cells, may also
be due to their enhanced or novel capacity to utilize an added compound as a
nutrient, growth factor or energy source. Selectable marker gene also refers to a
gene or a combination of genes whose expression in a cell gives the cell a
negative and/or a positive selective advantage.

The term "chimeric" refers to any gene or DNA that contains 1) DNA
sequences, including regulatory and coding sequences, that are not found
together in nature, or 2) sequences encoding parts of proteins not naturally
adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a
chimeric gene may comprise regulatory sequences and coding sequences that are
derived from different sources, or comprise regulatory sequences and coding
sequences derived from the same source, but arranged in a manner different from
that found in nature.

A "transgene" refers to a gene that has been introduced into the genome 
by transformation and is stably maintained. Transgenes may include, for 
example, DNA that is either heterologous or homologous to the DNA of a
particular plant to be transformed. Additionally, transgenes may comprise native
genes inserted into a non-native organism, or chimeric genes. The term 
"endogenous gene" refers to a native gene in its natural location in the genome of 
an organism. A "foreign" gene refers to a gene not normally found in the host 
organism but that is introduced by gene transfer. 

The terms "protein," "peptide" and "polypeptide" are used
interchangeably herein.

By "variants" is intended substantially similar sequences. For nucleotide 
sequences, variants include those sequences that, because of the degeneracy of 
the genetic code, encode the identical amino acid sequence of the native protein.
Naturally occurring allelic variants such as these can be identified with the use
of well-known molecular biology techniques, as, for example, with polymerase 
chain reaction (PCR) and hybridization techniques. Variant nucleotide
sequences also include synthetically derived nucleotide sequences, such as those 
generated, for example, by using site-directed mutagenesis, which encode the
native protein, as well as those that encode a polypeptide having amino acid
substitutions. Generally, nucleotide sequence variants of the invention will have
at least 40, 50, 60, to 70%, e.g., preferably 71%, 72%, 73%, 74%, 75%, 76%, 
77%, 78%, to 79%, generally at least 80%, e.g., 81%-84%, at least 85%, e.g., 
86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, to 98%,
sequence identity to the native (endogenous) nucleotide sequence.

"DNA shuffling" is a method to introduce mutations or rearrangements,
preferably randomly, in a DNA molecule or to generate exchanges of DNA 
sequences between two or more DNA molecules, preferably randomly. The 
DNA molecule resulting from DNA shuffling is a shuffled DNA molecule that is 
a non-naturally occurring DNA molecule derived from at least one template
DNA molecule. The shuffled DNA preferably encodes a variant polypeptide 
modified with respect to the polypeptide encoded by the template DNA, and may 
have an altered biological activity with respect to the polypeptide encoded by the 
template DNA. 

The nucleic acid molecules of the invention can be optimized for
enhanced expression in an organism of interest (Wada et al., Nucl. Acids Res.,
18:2367 (1990)). For plants see, for example, EPA035472; WO 91/16432; Perlak
et al., Proc. Natl. Acad. Sci. USA, 88:3324 (1991); and Murray et al., Nucl. Acids
Res., 17:477 (1989). In this manner, the genes or gene fragments can be
synthesized utilizing plant-preferred codons. See, for example, Campbell and
Gowri, 1990 for a discussion of host-preferred codon usage. Thus, the 
nucleotide sequences can be optimized for expression in any plant. It is 
recognized that all or any part of the gene sequence may be optimized or 
synthetic. That is, synthetic or partially optimized sequences may also be used. 

Variant nucleotide sequences and proteins also encompass sequences and proteins
derived from a mutagenic and recombinogenic procedure such as DNA
shuffling. With such a procedure, one or more different coding sequences can be 
manipulated to create a new polypeptide possessing the desired properties. In 
this manner, libraries of recombinant polynucleotides are generated from a
population of related sequence polynucleotides comprising sequence regions that
have substantial sequence identity and can be homologously recombined in vitro 
or in vivo. Strategies for such DNA shuffling are known in the art. See, for 
example, Stemmer, Nature, 370:389 (1994); Crameri et al., Nature Biotech.,
15:436 (1997); Moore et al., JMB, 272:336 (1997); Zhang et al., Proc. Natl.
Acad. Sci. USA, 94:4504 (1997); Crameri et al., Nature, 391:288 (1998); and
U.S. Patent Nos. 5,605,793 and 5,837,458.

"Conservatively modified variations" of a particular nucleic acid
sequence refers to those nucleic acid sequences that encode identical or 
essentially identical amino acid sequences, or where the nucleic acid sequence 
does not encode an amino acid sequence, to essentially identical sequences. 
Because of the degeneracy of the genetic code, a large number of functionally
identical nucleic acids encode any given polypeptide. For instance, the codons
CGT, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. 
Thus, at every position where an arginine is specified by a codon, the codon can 
be altered to any of the corresponding codons described without altering the
encoded protein. Such nucleic acid variations are "silent variations" which are
one species of "conservatively modified variations." Every nucleic acid 
sequence described herein which encodes a polypeptide also describes every 
possible silent variation, except where otherwise noted. One of skill will 
recognize that each codon in a nucleic acid (except ATG, which is ordinarily the
only codon for methionine) can be modified to yield a functionally identical
molecule by standard techniques. Accordingly, each "silent variation" of a 
nucleic acid which encodes a polypeptide is implicit in each described sequence. 
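
As a purely illustrative sketch of the silent variations described above, and not a part of the disclosed methods, the following Python fragment builds the standard genetic code, lists the synonymous codons for an amino acid, and counts how many distinct coding sequences encode the same polypeptide. The function names and the example sequence are hypothetical, and the standard nuclear genetic code is assumed.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(dna):
    """Count coding sequences (including the input) encoding the same protein."""
    total = 1
    for amino_acid in translate(dna):
        total *= len(synonymous_codons(amino_acid))
    return total


if __name__ == "__main__":
    print(synonymous_codons("R"))              # the six arginine codons
    print(count_silent_variants("ATGCGTTGG"))  # Met-Arg-Trp
```

In this sketch only the arginine codon of the hypothetical Met-Arg-Trp sequence can be varied silently, so the count equals the number of arginine codons.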

"Recombinant DNA molecule" is a combination of DNA sequences that 
are joined together using recombinant DNA technology and procedures used to
join together DNA sequences as described, for example, in Sambrook et al., Cold
Spring Harbor, NY: Cold Spring Harbor Laboratory Press (1989). 

The terms "heterologous DNA sequence," "exogenous DNA segment" or 
"heterologous nucleic acid," each refer to a sequence that originates from a 
source foreign to the particular host cell or, if from the same source, is modified
from its original form. Thus, a heterologous gene in a host cell includes a gene
that is endogenous to the particular host cell but has been modified through, for 
example, the use of DNA shuffling. The terms also include non-naturally 
occurring multiple copies of a naturally occurring DNA sequence. Thus, the 
terms refer to a DNA segment that is foreign or heterologous to the cell, or
homologous to the cell but in a position within the host cell nucleic acid in which
the element is not ordinarily found. Exogenous DNA segments are expressed to
yield exogenous polypeptides. 

A "microarray" as used herein is a solid support and a plurality of 
different oligonucleotides attached to the support. Each of the different 
oligonucleotides is attached to the surface of the solid support in a different
defined region, has a different determinable sequence, and is at least six 
nucleotides in length. Preferably, at least one of the different oligonucleotides is 
derived from a region of a polynucleotide having a nucleotide sequence selected 
from SEQ ID NO:46, SEQ ID NO:48 and SEQ ID NO:55, or the complement 
thereof.

A "homologous" DNA sequence is a DNA sequence that is naturally 
associated with a host cell into which it is introduced. 

"Wild-type" refers to the normal gene, e.g., a gene found in the highest 
frequency in a particular population, or organism found in nature without any 
known mutation.

"Genome" refers to the complete genetic material of an organism. 

"Vector" is defined to include, inter alia, any plasmid, cosmid, phage or 
binary vector in double or single stranded linear or circular form which may or 
may not be self-transmissible or mobilizable, and which can transform a
prokaryotic or eukaryotic host either by integration into the cellular genome or
by existing extrachromosomally (e.g., an autonomously replicating plasmid with an origin
of replication). 

Specifically included are shuttle vectors, by which is meant a DNA
vehicle capable, naturally or by design, of replication in two different host
organisms, which may be selected from actinomycetes and related species,
bacteria and eukaryotes (e.g., higher plant, mammalian, yeast or fungal cells).

"Cloning vectors" typically contain one or a small number of restriction 
endonuclease recognition sites at which foreign DNA sequences can be inserted 
in a determinable fashion without loss of essential biological function of the 
vector, as well as a marker gene that is suitable for use in the identification and
selection of cells transformed with the cloning vector. Marker genes typically
include genes that provide tetracycline resistance, hygromycin resistance or
ampicillin resistance. 

"Expression cassette" as used herein means a DNA sequence capable of 
directing expression of a particular nucleotide sequence in an appropriate host 
cell, comprising a promoter operably linked to the nucleotide sequence of
interest which is operably linked to termination signals. It also typically 
comprises sequences required for proper translation of the nucleotide sequence. 
The coding region usually codes for a protein of interest but may also code for a 
functional RNA of interest, for example antisense RNA or a nontranslated RNA,
in the sense or antisense direction. The expression cassette comprising the
nucleotide sequence of interest may be chimeric, meaning that at least one of its
components is heterologous with respect to at least one of its other components. 
The expression cassette may also be one which is naturally occurring but has 
been obtained in a recombinant form useful for heterologous expression. The
expression of the nucleotide sequence in the expression cassette may be under
the control of a constitutive promoter or of an inducible promoter which initiates 
transcription only when the host cell is exposed to some particular external 
stimulus. In the case of a multicellular organism, the promoter can also be 
specific to a particular tissue or organ or stage of development. 

Such expression cassettes will comprise the transcriptional initiation
region of the invention linked to a nucleotide sequence of interest. Such an
expression cassette is provided with a plurality of restriction sites for insertion of 
the gene of interest to be under the transcriptional regulation of the regulatory 
regions. The expression cassette may additionally contain selectable marker
genes.

A transcriptional cassette will include in the 5'-3' direction of
transcription, a transcriptional and translational initiation region, a DNA 
sequence of interest, and a transcriptional and translational termination region 
functional in plants. The termination region may be native with the 
transcriptional initiation region, may be native with the DNA sequence of
interest, or may be derived from another source. For expression in plants,
convenient termination regions are available from the Ti-plasmid of A.
tumefaciens, such as the octopine synthase and nopaline synthase termination
regions. See also, Guerineau et al., Mol. Gen. Genetics, 262:141 (1991);
Proudfoot, Cell, 64:671 (1991); Sanfacon et al., Genes Dev., 5:141 (1991);
Mogen et al., Plant Cell, 2:1261 (1990); Munroe et al., Gene, 91:151 (1990);
Ballas et al., Nucl. Acids Res., 17:7891 (1989); Joshi et al., Nucl. Acids Res.,
15:9827 (1987).

An oligonucleotide corresponding to a nucleic acid molecule of the 
invention may be about 30 or fewer nucleotides in length (e.g., 9, 12, 15, 18, 20,
21 or 24, or any number between 9 and 30). Generally, specific primers are
upwards of 14 nucleotides in length. For optimum specificity and cost
effectiveness, primers of 16-24 nucleotides in length may be preferred. Those
skilled in the art are well versed in the design of primers for use in processes such
as PCR. If required, probing can be done with entire restriction fragments of the
gene disclosed herein, which may be 100's or even 1000's of nucleotides in
length. 

"Coding sequence" refers to a DNA or RNA sequence that codes for a 
specific amino acid sequence and excludes the non-coding sequences 5' and 3'
to the coding sequence. It may constitute an "uninterrupted coding sequence",
i.e., lacking an intron, such as in a cDNA, or it may include one or more introns
bounded by appropriate splice junctions, e.g., as may be found in genomic DNA. 
An "intron" is a sequence of RNA which is contained in the primary transcript 
but which is removed through cleavage and re-ligation of the RNA within the 
cell to create the mature mRNA that can be translated into a protein. 

The terms "open reading frame" and "ORF" refer to the amino acid
sequence encoded between translation initiation and termination codons of a
coding sequence. The terms "initiation codon" and "termination codon" refer to
a unit of three adjacent nucleotides ("codon") in a coding sequence that specifies
initiation and chain termination, respectively, of protein synthesis (mRNA
translation).
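
By way of illustration only, the relationship between initiation codons, termination codons and an open reading frame can be sketched in Python as follows. The sketch assumes the standard genetic code (ATG as the sole initiation codon; TAA, TAG and TGA as termination codons), scans only the three forward frames of a DNA string, and uses a hypothetical function name and example sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)


def find_orfs(dna):
    """Return (start, end) index pairs of ATG...stop stretches in the forward frames.

    `start` is the index of the A of the initiation codon and `end` is one past
    the last base of the termination codon, so dna[start:end] is the full ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame initiation codon
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3))    # close the ORF at the stop codon
                start = None
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATGGTAACC"               # hypothetical sequence
    for start, end in find_orfs(example):
        print(start, end, example[start:end])
```

An initiation codon that is never followed by an in-frame termination codon is ignored in this sketch, since no complete reading frame is defined for it.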

A "functional RNA" refers to an antisense RNA, ribozyme, or other RNA
that is not translated. 

The term "RNA transcript" refers to the product resulting from RNA 
polymerase-catalyzed transcription of a DNA sequence. When the RNA
transcript is a perfect complementary copy of the DNA sequence, it is referred to
as the primary transcript, or it may be an RNA sequence derived from
posttranscriptional processing of the primary transcript and is referred to as the
mature RNA. "Messenger RNA" (mRNA) refers to the RNA that is without
introns and that can be translated into protein by the cell. "cDNA" refers to a
single- or a double-stranded DNA that is complementary to and derived from
mRNA. 
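
The complementarity between an mRNA and its cDNA can be illustrated with a short Python sketch. It is offered only as an example under the usual Watson-Crick pairing rules; the function names are hypothetical, sequences are written 5' to 3', and only the four unmodified RNA bases are handled.

```python
# Watson-Crick partners for copying RNA into DNA (A pairs with T, U with A).
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the DNA strand complementary to the mRNA, written 5' to 3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def second_strand_cdna(mrna):
    """Return the DNA strand with the same sense as the mRNA (U replaced by T)."""
    return mrna.upper().replace("U", "T")


if __name__ == "__main__":
    message = "AUGGCC"                     # hypothetical mRNA fragment
    print(first_strand_cdna(message))
    print(second_strand_cdna(message))
```

A double-stranded cDNA in this sketch is simply the pair of strings returned by the two functions.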

"Regulatory sequences" and "suitable regulatory sequences" each refer to 
nucleotide sequences located upstream (5' non-coding sequences), within, or
downstream (3' non-coding sequences) of a coding sequence, and which
influence the transcription, RNA processing or stability, or translation of the
associated coding sequence. Regulatory sequences include enhancers, 
promoters, translation leader sequences, introns, and polyadenylation signal 
sequences. They include natural and synthetic sequences as well as sequences 
which may be a combination of synthetic and natural sequences. As is noted
above, the term "suitable regulatory sequences" is not limited to promoters.
However, some suitable regulatory sequences useful in the present invention will
include, but are not limited to, constitutive promoters, tissue-specific promoters,
development-specific promoters, inducible promoters and viral promoters.

"5' non-coding sequence" refers to a nucleotide sequence located 5'
(upstream) to the coding sequence. It is present in the fully processed mRNA
upstream of the initiation codon and may affect processing of the primary 
transcript to mRNA, mRNA stability or translation efficiency (Turner et al., Mol.
Biotech., 3:225 (1995)).

"3' non-coding sequence" refers to nucleotide sequences located 3'
(downstream) to a coding sequence and include polyadenylation signal
sequences and other sequences encoding regulatory signals capable of affecting
mRNA processing or gene expression. The polyadenylation signal is usually
characterized by affecting the addition of polyadenylic acid tracts to the 3' end of 
the mRNA precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., Plant Cell, 1:671 (1989).

"Promoter" refers to a nucleotide sequence, usually upstream (5') to its
coding sequence, which controls the expression of the coding sequence by
providing the recognition for RNA polymerase and other factors required for 
proper transcription. "Promoter" includes a minimal promoter that is a short 
DNA sequence comprised of a TATA box and other sequences that serve to
specify the site of transcription initiation, to which regulatory elements are added
for control of expression. "Promoter" also refers to a nucleotide sequence that 
includes a minimal promoter plus regulatory elements that is capable of 
controlling the expression of a coding sequence or functional RNA. This type of 
promoter sequence consists of proximal and more distal upstream elements, the
latter elements often referred to as enhancers. Accordingly, an "enhancer" is a
DNA sequence which can stimulate promoter activity and may be an innate 
element of the promoter or a heterologous element inserted to enhance the level 
or tissue specificity of a promoter. It is capable of operating in both orientations 
(normal or flipped), and is capable of functioning even when moved either
upstream or downstream from the promoter. Both enhancers and other upstream
promoter elements bind sequence-specific DNA-binding proteins that mediate 
their effects. Promoters may be derived in their entirety from a native gene, or 
be composed of different elements derived from different promoters found in 
nature, or even be comprised of synthetic DNA segments. A promoter may also
contain DNA sequences that are involved in the binding of protein factors which
control the effectiveness of transcription initiation in response to physiological or 
developmental conditions. 

The "initiation site" is the position surrounding the first nucleotide that is 
part of the transcribed sequence, which is also defined as position +1. With
respect to this site all other sequences of the gene and its controlling regions are
numbered. Downstream sequences (i.e., further protein encoding sequences in the
3' direction) are denominated positive, while upstream sequences (mostly of the
controlling regions in the 5' direction) are denominated negative. 

Promoter elements, particularly a TATA element, that are inactive or that 
have greatly reduced promoter activity in the absence of upstream activation are 
referred to as "minimal or core promoters." In the presence of a suitable
transcription factor, the minimal promoter functions to permit transcription. A
"minimal or core promoter" thus consists only of all basal elements needed for 
transcription initiation, e.g., a TATA box and/or an initiator. 

"Constitutive expression" refers to expression using a constitutive or 

regulated promoter. "Conditional" and "regulated expression" refer to
expression controlled by a regulated promoter. 

"Operably-linked" refers to the association of nucleic acid sequences on 
a single nucleic acid fragment so that the function of one is affected by the other.
For example, a regulatory DNA sequence is said to be "operably linked to" or
"associated with" a DNA sequence that codes for an RNA or a polypeptide if the
two sequences are situated such that the regulatory DNA sequence affects 
expression of the coding DNA sequence (i.e., that the coding sequence or 
functional RNA is under the transcriptional control of the promoter). Coding 
sequences can be operably-linked to regulatory sequences in sense or antisense
orientation.

"Expression" refers to the transcription and/or translation of an 
endogenous gene or a transgene in plants. For example, in the case of antisense 
constructs, expression may refer to the transcription of the antisense DNA only. 
In addition, expression refers to the transcription and stable accumulation of
sense (mRNA) or functional RNA. Expression may also refer to the production
of protein. 

"Altered levels" refers to the level of expression in transgenic cells or 
organisms that differs from that of normal or untransformed cells or organisms. 

"Overexpression" refers to the level of expression in transgenic cells or 
organisms that exceeds levels of expression in normal or untransformed cells or
organisms.

"Antisense inhibition" refers to the production of antisense RNA
transcripts capable of suppressing the expression of protein from an endogenous 
gene or a transgene. 

"Co-suppression" and "transwitch" each refer to the production of sense 
RNA transcripts capable of suppressing the expression of identical or
substantially similar transgenes or endogenous genes (U.S. Patent No. 5,231,020).

"Gene silencing" refers to homology-dependent suppression of viral 
genes, transgenes, or endogenous nuclear genes. Gene silencing may be
transcriptional, when the suppression is due to decreased transcription of the
affected genes, or post-transcriptional, when the suppression is due to increased
turnover (degradation) of RNA species homologous to the affected genes
(English et al., Plant Cell, 8:179 (1996)). Gene silencing includes virus-induced
gene silencing (Ruiz et al., Plant Cell, 10:937 (1998)).

"Chromosomally-integrated" refers to the integration of a foreign gene or 
DNA construct into the host DNA by covalent bonds. Where genes are not
"chromosomally integrated" they may be "transiently expressed." Transient 
expression of a gene refers to the expression of a gene that is not integrated into 
the host chromosome but functions independently, either as part of an 
autonomously replicating plasmid or expression cassette, for example, or as part 
of another biological system such as a virus.

The following terms are used to describe the sequence relationships 
between two or more nucleic acids or polynucleotides: (a) "reference sequence", 
(b) "comparison window", (c) "sequence identity", (d) "percentage of sequence 
identity", and (e) "substantial identity".

(a) As used herein, "reference sequence" is a defined sequence used as a
basis for sequence comparison. A reference sequence may be a subset or the
entirety of a specified sequence; for example, as a segment of a full-length 
cDNA or gene sequence, or the complete cDNA or gene sequence. 

(b) As used herein, "comparison window" makes reference to a 
contiguous and specified segment of a polynucleotide sequence, wherein the
polynucleotide sequence in the comparison window may comprise additions or
deletions (i.e., gaps) compared to the reference sequence (which does not
comprise additions or deletions) for optimal alignment of the two sequences. 
Generally, the comparison window is at least 20 contiguous nucleotides in 
length, and optionally can be 30, 40, 50, 100, or longer. Those of skill in the art 
understand that to avoid a high similarity to a reference sequence due to
inclusion of gaps in the polynucleotide sequence, a gap penalty is typically
introduced and is subtracted from the number of matches. 

Methods of alignment of sequences for comparison are well known in the 
art. Thus, the determination of percent identity between any two sequences can
be accomplished using a mathematical algorithm. Preferred, non-limiting
examples of such mathematical algorithms are the algorithm of Myers and
Miller, CABIOS, 4:11 (1988); the local homology algorithm of Smith et al.,
Adv. Appl. Math., 2:482 (1981); the homology alignment algorithm of
Needleman and Wunsch, JMB, 48:443 (1970); the search-for-similarity method
of Pearson and Lipman, Proc. Natl. Acad. Sci. USA, 85:2444 (1988); the
algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA, 87:2264 (1990),
modified as in Karlin and Altschul, Proc. Natl. Acad. Sci. USA, 90:5873 (1993).
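
For illustration, a minimal Python sketch of a global alignment in the manner of Needleman and Wunsch is given below, showing how a gap penalty enters the score. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions chosen for the example and are not parameters prescribed herein; a linear gap penalty is used for simplicity.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):            # leading gaps in b
        score[i][0] = i * gap
    for j in range(1, cols):            # leading gaps in a
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),  # align residues
                              score[i - 1][j] + gap,             # gap in b
                              score[i][j - 1] + gap)             # gap in a

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j, top, bottom = len(a), len(b), [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

With these assumed values, aligning ACGT against AGT costs one gap, so three matches minus one gap penalty gives the final score.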

Computer implementations of these mathematical algorithms can be 
utilized for comparison of sequences to determine sequence identity. Such
implementations include, but are not limited to: CLUSTAL in the PC/Gene
program (available from Intelligenetics, Mountain View, California); the ALIGN
program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in
the Wisconsin Genetics Software Package, Version 8 (available from Genetics
Computer Group (GCG), 575 Science Drive, Madison, Wisconsin, USA).
Alignments using these programs can be performed using the default parameters.
The CLUSTAL program is well described by Higgins et al., Gene, 73:237
(1988); Higgins et al., CABIOS, 5:151 (1989); Corpet et al., Nucl. Acids Res.,
16:10881 (1988); Huang et al., CABIOS, 8:155 (1992); and Pearson et al., Meth.
Mol. Biol., 24:307 (1994). The ALIGN program is based on the algorithm of
Myers and Miller, supra. The BLAST programs of Altschul et al., JMB,
215:403 (1990); Nucl. Acids Res., 25:3389 (1997), are based on the algorithm of
Karlin and Altschul, supra.

Software for performing BLAST analyses is publicly available through 
the National Center for Biotechnology Information 
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high
scoring sequence pairs (HSPs) by identifying short words of length W in the
query sequence, which either match or satisfy some positive-valued threshold
score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., 1990,
supra). These initial neighborhood word hits act as seeds for initiating searches
to find longer HSPs containing them. The word hits are then extended in both 
directions along each sequence for as far as the cumulative alignment score can 
be increased. Cumulative scores are calculated using, for nucleotide sequences, 
the parameters M (reward score for a pair of matching residues; always > 0) and
N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension
of the word hits in each direction is halted when the cumulative alignment score
falls off by the quantity X from its maximum achieved value, the cumulative
score goes to zero or below due to the accumulation of one or more
negative-scoring residue alignments, or the end of either sequence is reached.
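
The seed-and-extend procedure described above can be sketched, in simplified form, as follows. This Python fragment is an illustration and not the BLAST implementation: seeding is restricted to exact word matches (the neighborhood threshold T is omitted), extension is ungapped, and the drop-off value X=20 and all function names are assumptions. The values M=5 and N=-4 follow the BLASTN defaults quoted in this description.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def _extend(query, subject, qi, sj, step, start_score, m, n, x):
    """Walk in one direction; return (best score, residues added to reach it)."""
    score = best = start_score
    best_len = k = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        k += 1
        if score > best:
            best, best_len = score, k
        if score <= 0 or best - score >= x:   # halting rules from the text
            break
        qi += step
        sj += step
    return best, best_len


def extend_seed(query, subject, i, j, w, m=5, n=-4, x=20):
    """Extend a seed at (i, j) both ways; return (score, query_span, subject_span)."""
    best, right = _extend(query, subject, i + w, j + w, +1, w * m, m, n, x)
    best, left = _extend(query, subject, i - 1, j - 1, -1, best, m, n, x)
    return best, (i - left, i + w + right), (j - left, j + w + right)


if __name__ == "__main__":
    q, s = "GGACGTACGTCC", "TTACGTACGTAA"     # hypothetical sequences
    print(find_seeds(q, s, 4))
    print(extend_seed(q, s, 2, 2, 4))
```

The spans returned are half-open index ranges, so the eight matching residues shared by the two hypothetical sequences are reported as positions 2 to 10.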

In addition to calculating percent sequence identity, the BLAST 
algorithm also performs a statistical analysis of the similarity between two 
sequences (see, e.g., Karlin & Altschul (1993), supra). One measure of 
similarity provided by the BLAST algorithm is the smallest sum probability
(P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For
example, a test nucleic acid sequence is considered similar to a reference 
sequence if the smallest sum probability in a comparison of the test nucleic acid 
sequence to the reference nucleic acid sequence is less than about 0.1, more
preferably less than about 0.01, and most preferably less than about 0.001.

To obtain gapped alignments for comparison purposes, Gapped BLAST
(in BLAST 2.0) can be utilized as described in Altschul et al., 1997.
Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated
search that detects distant relationships between molecules. See Altschul et al.,
supra. When utilizing BLAST, Gapped BLAST, PSI-BLAST, the default
parameters of the respective programs (e.g. BLASTN for nucleotide sequences,
BLASTX for proteins) can be used. The BLASTN program (for nucleotide 
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a
cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid
sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an
expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff &
Henikoff, 1989). See http://www.ncbi.nlm.nih.gov. Alignment may also be
performed manually by inspection. 

For purposes of the present invention, comparison of nucleotide
sequences for determination of percent sequence identity to the sequences
disclosed herein is preferably made using the BlastN program (version 1.4.7 or
later) with its default parameters or any equivalent program. By "equivalent 
program" is intended any sequence comparison program that, for any two 
sequences in question, generates an alignment having identical nucleotide or
amino acid residue matches and an identical percent sequence identity when
compared to the corresponding alignment generated by the preferred program. 

(c) As used herein, "sequence identity" or "identity" in the context of two 
nucleic acid or polypeptide sequences makes reference to a specified percentage 
of residues in the two sequences that are the same when aligned for maximum
correspondence over a specified comparison window, as measured by sequence
comparison algorithms or by visual inspection. When percentage of sequence 
identity is used in reference to proteins it is recognized that residue positions 
which are not identical often differ by conservative amino acid substitutions, 
where amino acid residues are substituted for other amino acid residues with
similar chemical properties (e.g., charge or hydrophobicity) and therefore do not
change the functional properties of the molecule. When sequences differ in
conservative substitutions, the percent sequence identity may be adjusted
upwards to correct for the conservative nature of the substitution. Sequences that 
differ by such conservative substitutions are said to have "sequence similarity" or 
"similarity." Means for making this adjustment are well known to those of skill 
in the art. Typically this involves scoring a conservative substitution as a partial
rather than a full mismatch, thereby increasing the percentage sequence identity.
Thus, for example, where an identical amino acid is given a score of 1 and a
non-conservative substitution is given a score of zero, a conservative substitution is
given a score between zero and 1. The scoring of conservative substitutions is
calculated, e.g., as implemented in the program PC/GENE (Intelligenetics,
Mountain View, California). 
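
The partial-credit scoring just described can be illustrated with a short Python sketch. It does not reproduce the PC/GENE calculation: the grouping of amino acids into conservative classes, the partial score of 0.5, and the function name are all assumptions made for the example, and the two sequences are assumed to be already aligned without gaps.

```python
# Hypothetical conservative groups based on broad side-chain chemistry.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("STNQ")]


def similarity_percent(seq_a, seq_b, conservative_score=0.5):
    """Score identities as 1, conservative substitutions as a partial score,
    other substitutions as 0, and return the result as a percentage."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        if x == y:
            total += 1.0
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            total += conservative_score
    return 100.0 * total / len(seq_a)


if __name__ == "__main__":
    # K/R is treated as conservative and L/A as non-conservative in this sketch.
    print(similarity_percent("MKVLD", "MRVAD"))
    print(similarity_percent("MKVLD", "MRVAD", conservative_score=0.0))
```

Setting the partial score to zero in this sketch recovers plain percent identity, which makes the upward adjustment for conservative substitutions easy to see.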

(d) As used herein, "percentage of sequence identity" means the value 
determined by comparing two optimally aligned sequences over a comparison 
window, wherein the portion of the polynucleotide sequence in the comparison
window may comprise additions or deletions (i.e., gaps) as compared to the
reference sequence (which does not comprise additions or deletions) for optimal
alignment of the two sequences. The percentage is calculated by determining the 
number of positions at which the identical nucleic acid base or amino acid 
residue occurs in both sequences to yield the number of matched positions,
dividing the number of matched positions by the total number of positions in the
window of comparison, and multiplying the result by 100 to yield the percentage 
of sequence identity. 
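
The calculation set out in (d) can be expressed as a brief Python sketch. The function name is hypothetical; the two inputs are assumed to be already optimally aligned, of equal length, with "-" marking a gap, and the alignment length is taken as the total number of positions in the window of comparison.

```python
def percent_identity(aligned_a, aligned_b):
    """Return 100 * matched positions / total positions in the comparison window."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Six of the eight aligned positions carry the identical base.
    print(percent_identity("ACGTACGT", "ACGAAC-T"))
```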

(e)(i) The term "substantial identity" of polynucleotide sequences means
that a polynucleotide comprises a sequence that has at least 70%, 71%, 72%,
73%, 74%, 75%, 76%, 77%, 78%, or 79%, preferably at least 80%, 81%, 82%,
83%, 84%, 85%, 86%, 87%, 88%, or 89%, more preferably at least 90%, 91%,
92%, 93%, or 94%, and most preferably at least 95%, 96%, 97%, 98%, or 99%
sequence identity, compared to a reference sequence using one of the alignment
programs described using standard parameters. One of skill in the art will
recognize that these values can be appropriately adjusted to determine
corresponding identity of proteins encoded by two nucleotide sequences by
taking into account codon degeneracy, amino acid similarity, reading frame
positioning, and the like. Substantial identity of amino acid sequences for these 
purposes normally means sequence identity of at least 70%, more preferably at 
least 80%, 90%, and most preferably at least 95%. 
Another indication that nucleotide sequences are substantially identical is
if two molecules hybridize to each other under stringent conditions (see below).
Generally, stringent conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. However, stringent conditions encompass temperatures in the range of
about 1°C to about 20°C below the Tm, depending upon the desired degree of
stringency as otherwise qualified herein. Nucleic acids that do not hybridize to
each other under stringent conditions are still substantially identical if the
polypeptides they encode are substantially identical. This may occur, e.g., when
a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. One indication that two nucleic acid sequences
are substantially identical is when the polypeptide encoded by the first nucleic
acid is immunologically cross reactive with the polypeptide encoded by the
second nucleic acid.

(e)(ii) The term "substantial identity" in the context of a peptide indicates 
that a peptide comprises a sequence with at least 70%, 71%, 72%, 73%, 74%,
75%, 76%, 77%, 78%, or 79%, preferably 80%, 81%, 82%, 83%, 84%, 85%,
86%, 87%, 88%, or 89%, more preferably at least 90%, 91%, 92%, 93%, or
94%, or even more preferably, 95%, 96%, 97%, 98% or 99%, sequence identity
to the reference sequence over a specified comparison window. Preferably,
optimal alignment is conducted using the homology alignment algorithm of
Needleman and Wunsch, 1970, supra. An indication that two peptide sequences
are substantially identical is that one peptide is immunologically reactive with 
antibodies raised against the second peptide. Thus, a peptide is substantially 
identical to a second peptide, for example, where the two peptides differ only by 
a conservative substitution. 

For sequence comparison, typically one sequence acts as a reference
sequence to which test sequences are compared. When using a sequence
comparison algorithm, test and reference sequences are input into a computer,
subsequence coordinates are designated if necessary, and sequence algorithm
program parameters are designated. The sequence comparison algorithm then
calculates the percent sequence identity for the test sequence(s) relative to the
reference sequence, based on the designated program parameters.

As noted above, another indication that two nucleic acid sequences are 
substantially identical is that the two molecules hybridize to each other under 
stringent conditions. The phrase "hybridizing specifically to" refers to the 
binding, duplexing, or hybridizing of a molecule only to a particular nucleotide
sequence under stringent conditions when that sequence is present in a complex
mixture (e.g., total cellular) DNA or RNA. "Bind(s) substantially" refers to
complementary hybridization between a probe nucleic acid and a target nucleic
acid and embraces minor mismatches that can be accommodated by reducing the
stringency of the hybridization media to achieve the desired detection of the
target nucleic acid sequence.

"Stringent hybridization conditions" and "stringent hybridization wash 
conditions" in the context of nucleic acid hybridization experiments such as 
Southern and Northern hybridizations are sequence dependent, and are different 
under different environmental parameters. Longer sequences hybridize
specifically at higher temperatures. The Tm is the temperature (under defined
ionic strength and pH) at which 50% of the target sequence hybridizes to a
perfectly matched probe. Specificity is typically the function of
post-hybridization washes, the critical factors being the ionic strength and
temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be
approximated from the equation of Meinkoth and Wahl, 1984:
Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L;
where M is the molarity of monovalent cations,
%GC is the percentage of guanosine and cytosine nucleotides in the DNA, %
form is the percentage of formamide in the hybridization solution, and L is the
length of the hybrid in base pairs. Tm is reduced by about 1°C for each 1% of
mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to
hybridize to sequences of the desired identity. For example, if sequences with
>90% identity are sought, the Tm can be decreased 10°C. Generally, stringent
conditions are selected to be about 5°C lower than the thermal melting point (Tm)
for the specific sequence and its complement at a defined ionic strength and pH.
However, severely stringent conditions can utilize a hybridization and/or wash at
1, 2, 3, or 4°C lower than the thermal melting point (Tm); moderately stringent
conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10°C lower
than the thermal melting point (Tm); low stringency conditions can utilize a
hybridization and/or wash at 11, 12, 13, 14, 15, or 20°C lower than the thermal
melting point (Tm). Using the equation, hybridization and wash compositions,
and desired Tm, those of ordinary skill will understand that variations in the
stringency of hybridization and/or wash solutions are inherently described. If the
desired degree of mismatching results in a Tm of less than 45°C (aqueous solution)
or 32°C (formamide solution), it is preferred to increase the SSC concentration
so that a higher temperature can be used. An extensive guide to the
hybridization of nucleic acids is found in Tijssen, 1993. Generally, highly
stringent hybridization and wash conditions are selected to be about 5°C lower
than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH.
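
As a non-limiting illustration, the Meinkoth and Wahl approximation and the
rule of about 1°C per 1% of mismatching can be sketched in Python as follows.
The function and parameter names are hypothetical, and "log M" is taken here
to be the base-10 logarithm, which is an assumption of this sketch; the
example values (1 M monovalent cations, 50% GC, 50% formamide, a 500 base
pair hybrid) are chosen only for arithmetic convenience.

```python
import math


def melting_temp_c(monovalent_molarity: float, percent_gc: float,
                   percent_formamide: float, hybrid_length_bp: int) -> float:
    """Approximate Tm in degrees C for a DNA-DNA hybrid.

    Tm = 81.5 + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L
    """
    if monovalent_molarity <= 0 or hybrid_length_bp <= 0:
        raise ValueError("molarity and hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molarity)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / hybrid_length_bp)


def tm_for_identity(tm_perfect_match_c: float, percent_identity: float) -> float:
    """Lower the Tm by about 1 degree C for each 1% of mismatching."""
    return tm_perfect_match_c - (100.0 - percent_identity)


if __name__ == "__main__":
    tm = melting_temp_c(1.0, 50.0, 50.0, 500)
    print(round(tm, 1))                         # perfectly matched hybrid
    print(round(tm_for_identity(tm, 90.0), 1))  # hybrid with 90% identity
```

A hybridization or wash temperature for a given stringency would then be
chosen the stated number of degrees below the value returned for the desired
identity.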

Very stringent conditions are selected to be equal to the Tm for a
particular probe. An example of stringent conditions for hybridization of
complementary nucleic acids which have more than 100 complementary residues
on a filter in a Southern or Northern blot is 50% formamide, e.g., hybridization
in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60
to 65°C. Exemplary low stringency conditions include hybridization with a
buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl
sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M
trisodium citrate) at 50 to 55°C. Exemplary moderate stringency conditions
include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C,
and a wash in 0.5X to 1X SSC at 55 to 60°C.

The following are examples of sets of hybridization/wash conditions that
may be used to clone orthologous nucleotide sequences that are substantially
identical to reference nucleotide sequences of the present invention; a reference
nucleotide sequence preferably hybridizes to the reference nucleotide sequence
in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with
washing in 2X SSC, 0.1% SDS at 50°C, more desirably in 7% sodium dodecyl
sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC,
0.1% SDS at 50°C, more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5
M NaPO4, 1 mM EDTA at 50°C with washing in 0.5X SSC, 0.1% SDS at 50°C,
preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at
50°C with washing in 0.1X SSC, 0.1% SDS at 50°C, more preferably in 7%
sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with
washing in 0.1X SSC, 0.1% SDS at 65°C.

By "variant" polypeptide is intended a polypeptide derived from the 
native protein by deletion (so-called truncation) or addition of one or more 
amino acids to the N-terminal and/or C-terminal end of the native protein;
deletion or addition of one or more amino acids at one or more sites in the native
protein; or substitution of one or more amino acids at one or more sites in the
native protein. Such variants may result from, for example, genetic
polymorphism or from human manipulation. Methods for such manipulations 
are generally known in the art. 

Thus, the polypeptides of the invention may be altered in various ways
including amino acid substitutions, deletions, truncations, and insertions.
Methods for such manipulations are generally known in the art. For example,
amino acid sequence variants of the polypeptides can be prepared by mutations
in the DNA. Methods for mutagenesis and nucleotide sequence alterations are
well known in the art. See, for example, Kunkel, Proc. Natl. Acad. Sci. USA,
82:488 (1985); Kunkel et al., Meth. Enzymol., 154:367 (1987); U.S. Patent No.
4,873,192; Walker and Gaastra, Techniques in Mol. Biol. (MacMillan Publishing
Co. (1983)), and the references cited therein. Guidance as to appropriate amino
acid substitutions that do not affect biological activity of the protein of interest
may be found in the model of Dayhoff et al., Atlas of Protein Sequence and
Structure (Natl. Biomed. Res. Found. 1978). Conservative substitutions, such as
exchanging one amino acid with another having similar properties, are preferred.

Thus, the genes and nucleotide sequences of the invention include both 
the naturally occurring sequences as well as mutant forms. Likewise, the
polypeptides of the invention encompass both naturally occurring proteins as
well as variations and modified forms thereof. Such variants will continue to
possess the desired activity. The deletions, insertions, and substitutions of the
polypeptide sequence encompassed herein are not expected to produce radical
changes in the characteristics of the polypeptide. However, when it is difficult to
predict the exact effect of the substitution, deletion, or insertion in advance of
doing so, one skilled in the art will appreciate that the effect will be evaluated by
routine screening assays. 

Individual substitutions, deletions or additions that alter, add or delete a
single amino acid or a small percentage of amino acids (typically less than 5%,
more typically less than 1%) in an encoded sequence are "conservatively
modified variations," where the alterations result in the substitution of an amino
acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. The
following five groups each contain amino acids that are conservative
substitutions for one another: Aliphatic: Glycine (G), Alanine (A), Valine (V),
Leucine (L), Isoleucine (I); Aromatic: Phenylalanine (F), Tyrosine (Y),
Tryptophan (W); Sulfur-containing: Methionine (M), Cysteine (C); Basic:
Arginine (R), Lysine (K), Histidine (H); Acidic: Aspartic acid (D), Glutamic acid
(E), Asparagine (N), Glutamine (Q). See also, Creighton, 1984. In addition,
individual substitutions, deletions or additions which alter, add or delete a single
amino acid or a small percentage of amino acids in an encoded sequence are also
"conservatively modified variations."

"Germline cells" refer to cells that are destined to be gametes and whose 
genetic material is heritable. 

The word "plant" refers to any plant, particularly to seed plants, and "plant
cell" is a structural and physiological unit of the plant, which comprises a cell
wall but may also refer to a protoplast. The plant cell may be in the form of an
isolated single cell or a cultured cell, or as a part of a higher organized unit such
as, for example, a plant tissue, or a plant organ.

"Plant tissue" includes differentiated and undifferentiated tissues or 
plants, including but not limited to roots, stems, shoots, leaves, pollen, seeds,
tumor tissue and various forms of cells and culture such as single cells,
protoplasts, embryos, and callus tissue. The plant tissue may be in plants or in
organ, tissue or cell culture. 

The term "altered plant trait" means any phenotypic or genotypic change
in a transgenic plant relative to the wild-type or non-transgenic plant host.

The term "transformation" refers to the transfer of a nucleic acid 
fragment into the genome of a host cell, resulting in genetically stable 
inheritance. Host cells containing the transformed nucleic acid fragments are 
referred to as "transgenic" cells, and organisms comprising transgenic cells are
referred to as "transgenic organisms". Examples of methods of transformation
of plants and plant cells include Agrobacterium-mediated transformation (De
Blaere et al., Meth. Enzymol., 143:277 (1987)) and particle bombardment
technology (Klein et al., Nature, 327:70 (1987); U.S. Patent No. 4,945,050).
Whole plants may be regenerated from transgenic cells by methods well known
to the skilled artisan (see, for example, Fromm et al., Biotech., 8:833 (1990)).

"Transformed," "transgenic," and "recombinant" refer to a host cell or 
organism such as a bacterium or a plant into which a heterologous nucleic acid
molecule has been introduced. The nucleic acid molecule can be stably
integrated into the genome by methods generally known in the art and disclosed in
Sambrook et al., 1989, supra. See also Innis et al., PCR Protocols, Academic
Press (1995); and Gelfand, PCR Strategies, Academic Press (1995); and Innis
and Gelfand, PCR Methods Manual, Academic Press (1999). Known methods
of PCR include, but are not limited to, methods using paired primers, nested
primers, single specific primers, degenerate primers, gene-specific primers,
vector-specific primers, partially mismatched primers, and the like. For
example, "transformed," "transformant," and "transgenic" plants or calli have
been through the transformation process and contain a foreign gene integrated
into their chromosome. The term "untransformed" refers to normal plants that
have not been through the transformation process.

A "transgenic" organism is an organism having one or more cells that 
contain an expression vector.

"Transiently transformed" refers to cells in which transgenes and foreign 
DNA have been introduced but not selected for stable maintenance. 

"Stably transformed" refers to cells that have been selected and 
regenerated on a selection medium following transformation.

"Genetically stable" and "heritable" refer to chromosomally-integrated
genetic elements that are stably maintained in the plant and stably inherited by
progeny through successive generations. 

"Enzyme activity" means herein the ability of an enzyme to catalyze the 
conversion of a substrate into a product. A substrate for the enzyme comprises
the natural substrate of the enzyme but also comprises analogues of the natural
substrate which can also be converted by the enzyme into a product or into an
analogue of a product. The activity of the enzyme is measured for example by
determining the amount of product in the reaction after a certain period of time,
or by determining the amount of substrate remaining in the reaction
mixture after a certain period of time. The activity of the enzyme is also
measured by determining the amount of an unused co-factor of the reaction
remaining in the reaction mixture after a certain period of time or by determining
the amount of used co-factor in the reaction mixture after a certain period of
time. The activity of the enzyme is also measured by determining the amount of
a donor of free energy or energy-rich molecule (e.g., ATP, phosphoenolpyruvate,
acetyl phosphate or phosphocreatine) remaining in the reaction mixture after a
certain period of time or by determining the amount of a used donor of a free
energy or energy-rich molecule (e.g., ADP, pyruvate, acetate or creatine) in the
reaction mixture after a certain period of time.

"Fungicide" is a chemical substance used to kill or suppress the growth
of fungal cells. 

An "inhibitor" is a chemical substance that causes abnormal growth, e.g., 
by inactivating the enzymatic activity of a protein such as a biosynthetic enzyme,
receptor, signal transduction protein, structural gene product, or transport protein
that is essential to the growth or survival, or alters the virulence or pathogenicity,
of the fungus. In the context of the instant invention, an inhibitor is a chemical
substance that alters the activity encoded by any one of SEQ ID NO:47, SEQ ID
NO:49, SEQ ID NO:56 or their orthologs.

"Isogenic" fungi are genetically identical, except that they may differ by
the presence or absence of a heterologous DNA sequence.

A "substrate" is the molecule that an enzyme naturally recognizes and 
converts to a product in the biochemical pathway in which the enzyme naturally 
carries out its function, or is a modified version of the molecule, which is also 
recognized by the enzyme and is converted by the enzyme to a product in an
enzymatic reaction similar to the naturally-occurring reaction. 

"Tolerance" as used herein is the ability of an organism, e.g., a fungus, to 
continue essentially normal growth or function when exposed to an inhibitor or 
fungicide in an amount sufficient to suppress the normal growth or function of 
native, unmodified fungi.

The Nucleic Acid Molecules of the Invention and Uses Thereof 

The involvement of peptide synthetase genes in fungal pathogenesis to
plants has been genetically tested only in two previous studies. In C. carbonum,
disruption of both copies of the HTS1 gene, which encodes HC-toxin synthetase,
caused loss of ability to make HC-toxin and the fungus became nonpathogenic
on HC-toxin sensitive corn plants (Panaccione et al., PNAS, 89, 6590, 1992),
indicating that the HC-toxin synthetase gene is a pathogenicity determinant. In
Fusarium avenaceum, enniatin-nonproducing transformants were obtained by
disruption of the enniatin synthetase encoding gene (esyn1) and these transformants
displayed significantly reduced virulence in a potato tuber tissue assay
(Herrmann et al., 1996), indicating that the enniatin synthetase gene is a virulence
factor in pathogenesis by the fungus. In these two pathosystems, only one fungal
secondary metabolite (the peptide toxin) was studied. In contrast, the polyketide
T-toxin has been well studied in C. heterostrophus and has been confirmed to be a
host-specific virulence factor (Yoder and Turgeon, 1996; Yoder et al., 1997,
supra) and this study demonstrated that a second secondary metabolite, the
hypothetical CPS1 toxin, is also involved in pathogenesis by the fungus. Unlike
the T-toxin biosynthetic genes such as PKS1 and DEC1 that are found only in
race T (Yang et al., 1996, supra; Rose et al., 1996, supra), CPS1 is found in both
race O and race T. Disruption of CPS1 in either race causes dramatically reduced
fungal virulence as tested on N-cytoplasm corn. This result suggests that the CPS1
toxin could be the same as the "race O" toxin proposed previously (Yoder,
1981). However, as disclosed herein, CPS1 is a CoA ligase.

Interestingly, a Tox+, cps1- mutant also shows reduced virulence on
T-cytoplasm corn although it produced the same amount of T-toxin as wild type
race T. This is unusual because the interaction between T-toxin and the
T-corn-unique URF13 protein is highly specific; the same outcomes should be
expected if two strains that produce the same amount of T-toxin attack the same
host, T-corn. The most likely explanation for this result is that the fungal growth in
planta has been inhibited by the host plant and the poor growth results in
reduced T-toxin production, which is normal when the fungus is grown in
culture. Reduced virulence on T-cytoplasm corn is then due to reduced T-toxin
production, like that seen in leaky Tox mutants. This inhibition of growth could be
due to the failure of suppression of the host defense mechanism by the fungus,
which is mediated by the CPS1 controlled peptide toxin. A cps1- mutant that
fails to produce this "suppressor" would not be able to colonize plant tissues as
vigorously as wild type does, resulting in the reduced ability to cause disease as
indicated by the smaller lesion phenotype. If this turns out to be the case, CPS1
should be considered as a general virulence factor as proposed for enniatin.

It is possible that cps1- mutants are still able to produce a certain
amount of CPS1 toxin. One possibility is that the gene has not been completely
inactivated by insertional mutagenesis or targeted disruption. The original REMI
insertion occurred at core sequence 1 of CPS1A, a region that might not be
critical (function of core 1 is unknown). The second targeted site is located
between cores 1 and 2 of CPS1B and the third is located between cores 2 and 3
of the same module. None of the three insertions disrupts a critical motif. On the
other hand, CPS1 contains a number of in-frame start codons and some of them
are located immediately downstream of these insertion sites. It is possible that
each of these disruptions actually resulted in two subtranscripts: one is
transcribed normally from the start codon of CPS1 and stops at the insertion site,
and the second is transcribed from near one of these in-frame ATGs downstream
of the insertion site and stops at the end of CPS1. Both transcripts could give a
truncated protein that still has enzymatic activity. But these separate enzymes
might have affinities for their substrates lower than that of the holoenzyme. The
reduced production of CPS1 toxin might be due to the CPS1 holoenzyme having
been split into two fractions by the vector insertion and the resulting truncated
proteins being much less active than the original polypeptide. This hypothesis
can be tested by constructing a C. heterostrophus strain in which the entire CPS1
encoding sequence has been deleted.

The second possibility is the existence of multiple copies of CPS1 in the
genome. Previous studies have demonstrated that the gene encoding HC-toxin
synthetase (HTS1) is duplicated in the genome and both copies (HTS1-1 and
HTS1-2) are 270 kb apart in most Tox2+ isolates of C. carbonum (Ahn and
Walton, Plant Cell, 8, 887, 1996). Disruption of either copy reduced HTS1
activity but did not affect HC-toxin production; when both copies were
disrupted, HC-toxin production was abolished (Panaccione et al., 1992, supra).
But in contrast to the case of HTS1, gel blot analysis does not indicate the
presence of a second copy of CPS1 and disruption of CPS1 does affect the
production of the putative toxin. It is unlikely that two genes with similar
organization are in the genome. An alternative postulation is that there may be a
second gene which encodes a protein with the same enzyme activity as CPS1 but
does not have significant sequence homology to CPS1. This hypothesis is hard
to test unless this gene is clustered with CPS1 and can be recovered by
chromosome walking.

Pathogenesis by C. heterostrophus on corn involves at least two
secondary metabolites: the T-toxin, a host-specific factor which determines high
virulence on a particular host, T-corn, and the hypothetical CPS1 toxin, a general
factor (either virulence or pathogenicity factor) which contributes to basic
mechanisms underlying the disease establishment by the fungus in common host
plants.

By genomic DNA hybridization, C. heterostrophus CPS1 homologs were 
found in 16 additional fungal species belonging to 5 genera. Hybridization 
signals for some were as strong as the C. heterostrophus gene, indicating that 
CPS1 is highly conserved among these fungi. This conservation appears to 
match the taxonomic relationships between these species. Cochliobolus
(anamorph Bipolaris) and Setosphaeria (anamorph Exserohilum) are closely
related genera. 

Two species, C. victoriae and C. carbonum, which are able to cross to 
each other and thus may not be different species (Scheffer et al., 1967; Yoder et
al., 1989), showed the same hybridization pattern to CPS1. B. sacchari, the
closest asexual relative of C. heterostrophus, hybridized to two HindIII
fragments that were only seen in C. heterostrophus itself, but all other species
gave only one distinct polymorphic band. Phylogenetic analyses using the
internal transcribed spacer (ITS) sequences and fragments of the GPD (vanWert
and Yoder, 1992) and MAT genes (Turgeon et al., Mol. Gen. Genet., 238, 270,
1993) also put C. victoriae/C. carbonum and C. heterostrophus/B. sacchari
closest to each other (Turgeon and Berbee, 1997). These results might imply
that CPS1 has coevolved with these genes.

The genera Cochliobolus and Setosphaeria include many plant 
pathogenic species that are commonly associated with leaf spots or blights, 
mainly on cultivated cereals and wild grasses (Sivanesan, 1987; Alcorn, 1988).
This group of phytopathogenic fungi includes both mild pathogens and severe
pathogens that often produce host-specific toxins (Yoder, 1980, supra). One of
the essential questions is whether or not the various diseases on diverse host
plants caused by these fungi involve common factors or depend only on
individual specific factors, such as host-specific toxins.

Previous studies have shown that host-specific toxins can be critical 
factors for determining either virulence or host-range, but they do not account for 
general pathogenicity since they are produced only by certain isolates in the 
species and the corresponding biosynthetic genes are found only in these
toxin-producing isolates (Yoder et al., 1997, supra). In contrast, CPS1 homologs are
found in all Cochliobolus and Setosphaeria species tested so far, suggesting they
are a common factor shared by this group. Disruption of the CPS1 homolog in
the oat pathogen C. victoriae caused dramatically reduced virulence to
victorin-susceptible oats although the transformants produced wild type levels of
victorin. This result is similar to that with C. heterostrophus race T, in which cps1-
disruptants still produced wild type levels of T-toxin but showed reduced
virulence on T-cytoplasm corn. These results argue strongly that host-specific
toxins alone are not sufficient in determining the ultimate outcome of
fungus/plant interactions and suggest that the establishment of disease by these
fungi also requires CPS1, which might control a pathway for general
pathogenicity. 

In the early 1990s, studies on pathogenesis by uropathogenic E. coli led 
to the identification of pathogenicity gene clusters, termed "pathogenicity 
islands" (Hacker et al., 1990; Blum et al., 1994). Subsequently, similar gene
clusters were identified in additional animal or human bacterial pathogens,
including Yersinia pestis, Helicobacter pylori and Salmonella typhimurium.
These islands often contain genes for production of toxins or genes encoding
proteins that are capable of interacting with host defense factors or required for
type III secretion systems that deliver virulence proteins into host cells. Usually,
they are found only in pathogenic strains (or species); in rare cases, they occur in
nonpathogenic strains of the same species or related species (Hacker et al., Mol.
Microbiol., 23, 1089, 1997).

In phytopathogenic bacteria, hrp gene clusters have been referred to as
"pathogenicity islands" because they have several features in common with
"pathogenicity islands" in animal pathogenic bacteria, i.e., they are found only in
pathogenic species (required for plant pathogenicity) and contain highly
conserved genes (hrc genes) defining the type III protein secretion system 
(Alfano and Collmer, 1996; Barinaga, 1996). 

In plant pathogenic fungi, genes or gene clusters with characteristics of
"pathogenicity islands" have been identified from certain species, i.e., in Nectria
haematococca, the PDA genes for detoxifying the pea phytoalexin and other pea
pathogenicity genes (PEP) are located on dispensable chromosomes that are
found in all isolates pathogenic to pea but usually absent in all nonpathogenic
isolates (VanEtten et al., Antonie Van Leeuwenhoek, 65, 263, 1994; Liu et al.,
1997, supra). In the genus Cochliobolus, the Tox2 gene cluster controlling the
biosynthesis of HC-toxin is found only in C. carbonum race 1 (pathogenic to
hm1hm1 corn) and the Tox1 genes controlling T-toxin production are found only
in C. heterostrophus race T (highly virulent on T-cytoplasm corn); all other races
of the same species and all other fungal species tested so far lack these Tox genes
(Ahn and Walton, 1996, supra; Yang et al., 1996, supra; Yoder et al., 1997,
supra).

CPS1 differs in two important ways from these fungal
"pathogenicity islands". First, it is highly conserved among several
phytopathogenic Cochliobolus species and relatives. Second, like certain
bacterial "pathogenicity islands", CPS1 also has homologs in "nonpathogenic"
species. C. homomorphus and C. dactyloctenii, neither of which causes disease
on plants, hybridized strongly to CPS1. This may reflect genetic changes in the
"pathogenicity island" that resulted in loss of pathogenicity. In the bacterial
genus Listeria, which includes several human or animal pathogenic species
harboring highly conserved "pathogenicity islands", the "pathogenicity island"
homolog in the nonpathogenic species (L. seeligeri) was found to be "silent" due
to a mutation that occurred in the promoter region of a critical regulatory gene in
the cluster (Hacker et al., 1997, supra). These features suggest that the CPS1
gene cluster and homologs could define a new group of fungal "pathogenicity
islands".

It is known that the evolution of pathogenicity involves two major
processes. A pathogenic microorganism could originate from nonpathogenic
progenitors by slow modifications (such as point mutations and genetic
recombination) of genes that were adapted for parasitic growth on hosts or by the
integration of large fragments of "alien" DNA into the genome that enable the
recipient to attack particular hosts (horizontal gene transfer). The latter can
occur in the recent or distant evolutionary past. Subsequent vertical transmission
in the lineage (if the transferred gene is stable in the recipient genome) would
result in the preservation of the gene in all species that diverged after the acquisition
of the gene(s) (Scheffer, 1991; Arber, Gene, 135, 49, 1993; Krishnapillai, 1996;
Burdon and Silk, 1997).

In the past few years, substantial evidence has become available that
supports the hypothesis of horizontal gene transfer. All "pathogenicity islands"
in animal pathogenic bacteria are believed to have been acquired by a horizontal
transfer event (recent or past) because they usually differ in G+C content from
the recipient genome and have transposable elements at the boundaries of the
gene clusters (Hacker et al., 1997, supra). The hrp "pathogenicity islands" do
not show a significant difference in G+C content or association with
transposable elements, but they are also believed to have arisen similarly because
hrc genes in these "pathogenicity islands" show high similarity to genes defining
the type III protein secretion system found in animal pathogenic bacteria as
30 mentioned above (Alfano and Collmer, 1996; and Barinaga, 1996). 

Although CPS1 itself has several typical fungal introns and a G+C
content (51.5%) similar to most known fungal genes, genomic regions (about 1.5
kb) flanking the gene have higher G+C content (>60%). Several short G+C-rich
regions are also found in the gene cluster; one of the open reading frames
(ORF10) has a 63.6% G+C content. Compared to those filamentous fungal
genomes characterized so far, including N. crassa, A. nidulans and U. maydis
(all have G+C content of 51-54%; see Karlin and Mrazek, PNAS, 94, 10227, 1997),
the genomic region around CPS1 is unusual. This might suggest that the gene
cluster harboring CPS1 came from a bacterial source (since most bacterial genes
are known to have a high G+C content), but has evolved into a fungal version.
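
The G+C comparison described above can be illustrated with a short
computational sketch. The following Python code is not part of the original
analysis; the function names, window size, step and the 60% cut-off parameter
are illustrative assumptions. It reports the G+C percentage of a sequence and
of sliding windows along it, so that G+C-rich stretches can be flagged.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage.

    Ambiguous symbols (e.g. N) are ignored in both numerator and denominator.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def windowed_gc(seq: str, window: int, step: int):
    """Yield (start, G+C %) for every full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])


def gc_rich_windows(seq: str, window: int, step: int, threshold: float = 60.0):
    """Return the windows whose G+C content exceeds the threshold.

    The 60% default is an illustrative assumption taken from the ">60%"
    figure quoted for the flanking regions.
    """
    return [(s, g) for s, g in windowed_gc(seq, window, step) if g > threshold]
```

In practice the window would be hundreds of bases wide; a call such as
gc_rich_windows(cluster_sequence, 500, 100), where cluster_sequence is a
hypothetical variable holding the cloned region, would list candidate
G+C-rich segments for closer inspection.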

Based on these data, CPS1 homologs may have a common ancestral gene 
which was acquired from a bacterial species via horizontal transfer and then 
maintained by the fungal genome via vertical transmission in closely related 
lineages. 

In the evolution process, the genus Cochliobolus could also have
inherited a second gene (X) controlling the ability to take up foreign DNA, by
which its ancestor took up the "alien" CPS1. As a result, this group of fungi is
able to keep trapping genes from other organisms by additional "horizontal
transfers", giving rise to new races or even new species characterized by the
ability to produce unique pathogenesis factors. The direct support for this
hypothesis is that both the Tox2 locus of C. carbonum and the Tox1 locus of C.
heterostrophus are associated with large fragments of "alien" DNA (A+T-rich and
highly repeated), and the same could also be true for Tox3 controlling victorin
production by C. victoriae, although there is as yet no direct experimental
evidence (Ahn and Walton, 1996, supra; Yang et al., 1996, supra; Yoder et al.,
1997, supra). In contrast to CPS1, these gene transfers must have occurred in
the recent evolutionary past because both Tox1 and Tox2 loci are found only in
specific isolates in the species, e.g., the acquisition of Tox1 genes probably
occurred as recently as the 1960s, when race T was first identified in the field
(Yoder et al., 1997, supra).

There are other possibilities for the evolution of CPS1. First, each genus
mentioned above could have acquired CPS1 independently after divergence of
the lineage. But this seems less likely because this would need to happen at the
same time and involve the same donor organism, given that the homologs
detected in Cochliobolus and Setosphaeria gave similar hybridization signal
intensity. Second, the horizontal transfer of CPS1 could have occurred at
earlier time periods, such as before the divergence of Pleosporales or even the
Ascomycotina. To test these hypotheses, detection of CPS1 homologs in
Pyrenophora, Pleospora and other genera must be done by either genomic
DNA hybridization or PCR. Based on the facts discussed here, it is not
unreasonable to predict that additional CPS1 homologs will be found in other
fungal species. Further investigation could provide a direct entry point for
understanding the evolution of fungal pathogenesis to plants.

The C. heterostrophus CPS1 gene was cloned by identification of
genomic DNA fragments recovered from the tagged site in a mutant generated
using REMI insertional mutagenesis. Characterization of two overlapping
cosmid clones in this study has proved that no deletions or chromosome
rearrangements are associated with the gene tagging event, because both cosmids
carry the same fragment, which spans the REMI insertion site, and the nucleotide
sequence in this region is the same as that of the genomic DNA recovered from
the tagged site. This undoubtedly clarifies the identity of CPS1, which is the
major biosynthetic gene. Mapping and sequencing of the two cosmids extended the
sequence by 27.4 kb from the previously cloned fragment, leading to the
characterization of 38.7 kb of contiguous genomic DNA, the largest genomic
region analyzed so far in C. heterostrophus. In addition to CPS1 and TES1,
sequence analysis of this region revealed at least 11 open reading frames; three
of them, designated as DBZ1, CAT1 and DEC2, respectively, apparently encode
functional proteins. The tight linkage of these genes suggests that they may be
involved in the same pathway.

In filamentous fungi, in some cases, genes in pathways for biosynthesis
of secondary metabolites are dispersed on different chromosomes, e.g., the
cephalosporin C pathway genes in Acremonium chrysogenum (Mathison et al.,
Curr. Genet., 23, 33, 1993) and the melanin pathway genes in Colletotrichum
lagenarium (Kubo et al., Appl. Environ. Microbiol., 62, 4340, 1996). In other
cases, tightly linked genes are usually found to be functionally related to a
common pathway. This clustering organization has been exemplified by the
sterigmatocystin pathway genes of Aspergillus nidulans, in which 25
coordinately regulated transcripts are found in a 60 kb genomic region (Brown et
al., 1996), and the trichothecene pathway genes of Fusarium sporotrichioides, in
which 9 genes are clustered in a 25 kb region and 8 of them have been shown to
be required for the pathway function (Hohn et al., Mol. Gen. Genet., 248, 95,
1995). The genes involved in biosynthesis of certain fungal peptides are also
found as clusters. The tight linkage between CPS1 and these additional genes
might reveal the presence of a novel secondary metabolite pathway in C.
heterostrophus. In this pathway, CPS1 is the major structural gene since it
encodes a large multifunctional enzyme with all catalytic activities required for
synthesis of a secondary metabolite, presumably a peptide phytotoxin; other
genes may carry out different functions required for coordinate operation of the
pathway, such as regulation, posttranslational modification or substrate
processing, as discussed below.

Both functional and structural analyses strongly support the hypothesis
that the CPS1 gene cluster controls a novel biosynthetic pathway. Pathway
genes have been studied only in a few filamentous fungi, mainly for industrial
purposes (Keller et al., J. Ind. Microbiol. Biotechnol., 19, 305, 1997). For plant
pathogenic fungi, little is known about pathway genes for fungal pathogenesis.
In C. heterostrophus, recent cloning of two Tox1 genes, PKS1 (Yang et al., 1996,
supra) and DEC1 (Rose et al., 1996, supra), has contributed to a breakthrough
in understanding the molecular mechanism for biosynthesis of T-toxin, a
virulence determinant in the fungus/corn interaction. But further identification
of related pathway genes has been unsuccessful because the two genes are
located on different chromosomes and each is embedded in A+T-rich DNA
(Yoder et al., 1997, supra). In contrast, the CPS1 cluster provides a good
opportunity to explore a pathogenesis pathway.

First, it resides in a "normal" sequence region. A G+C content of 50-55%
is found in most of the cloned sequences and no A+T-rich DNA is associated
with either end of the cloned region. This would facilitate cloning of additional
pathway genes by further chromosome walking, by screening of cosmid libraries
or by targeted integration and plasmid rescue. Second, it contains a regulatory
gene (DBZ1) which is presumably linked to a signal transduction pathway.
Isolation of genes that interact with DBZ1 could reveal novel factors mediating
the molecular communication between the fungal pathogen and the host plant.
Further characterization of DBZ1 (along with position-specific disruption or
deletion) would also be helpful in determining the limit of the gene cluster,
because tightly linked genes involved in a common pathway are often
coordinately regulated by the same regulatory factor (Keller et al., 1997, supra).
Finally, CPS1 genes are found in both race T and race O, and homologs are
also found in other Cochliobolus species. Presence of high G+C content may
imply that these genes evolved from a bacterial ancestor, and the conservation in
these fungi may correlate with the phytopathogenic function of the gene products
encoded by the CPS1 cluster. Further investigation of this cluster should provide
insights into the evolution of general pathogenicity factors among this group of
fungi.

Ferric reductases are a group of enzymes found in bacteria, fungi, plants
and animals that are responsible for reduction of ferric iron to ferrous iron, an
absorptive form used by the organism. They have been well studied in S.
cerevisiae, C. albicans, H. capsulatum and the like. The yeast FER1 has been
expressed in tobacco (Oki et al., 1999).

Previous studies have shown that FER genes could be important 
pathogenic determinants. Timmerman and Woods have proposed that in H. 
capsulatum FER could play critical roles in the acquisition of iron in three 
different ways: from inorganic or organic ferric salts, from host Fe(III) binding
proteins (transferrin and the like), and from siderophores produced by the fungus 
itself (to reduce and release the iron chelated by the siderophore molecules). 

On the other hand, iron sequestration in response to microbial infection 
has been demonstrated to be a host defense mechanism. The infection-related 
iron acquisition system in the pathogen can be considered to be an important
mechanism against host defense and for a successful colonization by the 
pathogen in the host cells. This could be a general mechanism for all pathogenic 
fungi. 

CPS1 encodes a peptide synthetase which is responsible for
biosynthesis of a novel siderophore with unusual amino acids, hydroxy acids and
architecture, which is why CPS1 does not show similarity to common NRPSs.
The CPS1 siderophore can compete with the host for iron acquisition when the
fungus enters its host cells, where iron is limited due to host sequestration. In
particular, for root pathogens such as C. victoriae, sequestration may be stronger
at the root surface. This could explain why the cps1 mutant showed drastically
reduced virulence. FER1 could be required to release iron from the CPS1
siderophore, which explains its location near the CPS1 gene. Moreover, fungal
strains could be cultured in iron-limiting conditions because CPS1, and likely
other genes in the cluster, may be turned on only during conditions of iron
depletion.

In a preferred embodiment, the polypeptides, including those having
substantially similar activities to SEQ ID NO:47, SEQ ID NO:49, or SEQ ID
NO:56, are encoded by nucleotide sequences derived from fungi, preferably from
pathogenic fungi, desirably identical or substantially similar to the nucleotide
sequences set forth in SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:55 or the
complement thereof.

In another preferred embodiment, the present invention describes a
method for identifying agents having the ability to inhibit or reduce the activity
of any one or more of SEQ ID NO:47, SEQ ID NO:49 or SEQ ID NO:56 in
fungi. Preferably, a transgenic "knockout" fungus and/or fungal cell is obtained,
which preferably is stably transformed and which comprises a deletion in any of
SEQ ID NO:46, SEQ ID NO:48 or SEQ ID NO:55. Thus, in one embodiment,
the gene product encoded by the nucleotide sequence is not expressed, or has
reduced or aberrant expression. In another embodiment, the transgenic fungus or
cell comprises the corresponding non-deleted sequences linked to a promoter to
yield a gene product which is overexpressed. An agent is then contacted with the
transgenic fungus and/or cell, and the growth, development, virulence or
pathogenicity of the transgenic fungus and/or cell is determined relative to the
growth, development, virulence or pathogenicity of the corresponding transgenic
fungus and/or cell to which the agent was not applied, or of the corresponding
non-transgenic fungus and/or cell.

The present invention generally relates to an isolated nucleic acid
molecule from a fungal pathogen encoding a CPS1 peptide synthetase, an iron
reductase or a permease/MFS transporter. In a preferred embodiment, a DNA
molecule has a nucleotide sequence which hybridizes to a DNA molecule having
a sequence corresponding to SEQ ID NO:46, SEQ ID NO:48 or SEQ ID NO:55.
Other DNA molecules of the present invention include DNA molecules that
have a sequence which is greater than 65% identical to the nucleotide sequence
of SEQ ID NO:46, SEQ ID NO:48 or SEQ ID NO:55. Nucleotide sequence
similarity is determined by the BLAST program with the default parameters
(Altschul et al., "Basic Local Alignment Search Tool," J. Mol. Biol., 215:403
(1990)). Preferred sequences include those DNA molecules which will hybridize
to a nucleic acid molecule having the sequence of SEQ ID NO:46, SEQ ID
NO:48, SEQ ID NO:55 or the complement thereof. Preferably, the DNA
molecules hybridize to SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:55, or its
complement under low, moderate, or stringent conditions.
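
For illustration only, the identity threshold above can be expressed as a
simple calculation over a pairwise alignment. The Python sketch below does not
run or reimplement BLAST; it assumes that two sequences have already been
aligned by some program and are supplied as equal-length strings with "-"
marking gaps, and it counts identical columns over the alignment length. The
function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Both inputs must be the same length, with '-' marking gap positions.
    Identity is (identical columns) / (alignment columns) * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a column of two gaps carries no information
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


def exceeds_identity(aligned_a: str, aligned_b: str, cutoff: float = 65.0) -> bool:
    """True if the alignment is more than `cutoff` percent identical."""
    return percent_identity(aligned_a, aligned_b) > cutoff
```

The result depends on how the alignment was produced and on the region
compared, so a value computed this way is only comparable to a BLAST-reported
identity when the same alignment is used.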

Other proteins or polypeptides of the present invention include 
polypeptides having an amino acid sequence which has at least 75% similarity to 
the amino acid sequence of SEQ ID NO:47, SEQ ID NO:49 or SEQ ID NO:56. 
In a preferred embodiment of the invention, the protein or polypeptide will have
at least 90% similarity with SEQ ID NO:47, SEQ ID NO:49 or SEQ ID NO:56.

In addition, the nucleic acid molecules of the invention may be modified,
adapted, and optimized in such a manner that, when transferred into an
appropriate host cell, the modified polynucleotide confers an altered phenotype
brought about by the polypeptide encoded by the modified sequence. One
advantage of this method is that it can be used to rapidly evolve any protein
without knowledge of its structure. Peptide synthetase, iron reductase and/or
permease/MFS transporter polynucleotides can be altered using
sequence-shuffling methods as described by WO 00/28008 and references therein.
Peptide synthetases of the invention can be recombined with other peptide
synthetases, iron reductases and/or permeases/MFS transporters to generate
peptide synthetases, iron reductases and/or permeases/MFS transporters of
desired and/or novel specificity and/or activity, and thus generate desired
and/or novel non-encoded peptide products. Such novel peptide synthetases, iron
reductases and/or permeases/MFS transporters would have at least one active
domain or other desired property-imparting domain (e.g., binding, enzymatic
activity, specificity determining).

Briefly, sequences or fragments of sequences are shuffled by various
recombinatorial methods, the shuffled polynucleotide is introduced into a
suitable host for expression, the resulting phenotype is measured and the
modified phenotype is compared with the phenotype produced by the unmodified
sequence. Here, "phenotype" refers to the trait of interest and may include
measuring the amount, conformation, composition, or enzymatic activity of the
polypeptide encoded, if the sequence shuffling is being performed to modify a
single protein. Phenotype may also be assessed by measuring the effect of
expression of the modified peptide synthetase, iron reductase and/or
permease/MFS transporter polynucleotide on expression of other genes, on
cellular processes such as respiration or glycolysis, on tissue-level processes
such as cell shape and size, and on organismal traits such as pathogenicity
and/or virulence. Sequence-shuffled peptide synthetase polynucleotides producing
a desirable phenotype are then selected and further modified, and the resulting
phenotype is measured. The shuffling and selection process is performed
iteratively until sequence-shuffled polynucleotides encoding at least one
polypeptide producing the desired phenotype are obtained, or until optimization
of the trait of interest has plateaued and no further improvement is seen in
subsequent rounds of shuffling and selection. Alternatively, multiple rounds of
recombination of peptide synthetase sequences may be performed prior to any
selection step, with the aim of increasing the diversity of the resulting
populations of nucleic acids prior to selection.
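
The iterate-until-plateau logic described above can be sketched as a small
simulation. The Python code below is a toy model only: real phenotypes are
measured in a host, whereas here the "phenotype" is a hypothetical scoring
function supplied by the caller, and the pool size, number of retained
variants and plateau criterion (the patience parameter) are illustrative
assumptions.

```python
import random


def single_point_crossover(a: str, b: str, rng: random.Random) -> str:
    """Join the start of one equal-length sequence to the end of another."""
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]


def shuffle_and_select(population, score, recombine, rng,
                       keep=4, patience=3, max_rounds=50):
    """Alternate selection and recombination until the best score plateaus.

    Returns the best variant found and the best score after each round.
    """
    history = [max(score(s) for s in population)]
    stale = 0
    for _ in range(max_rounds):
        # Selection: retain the top-scoring variants as parents.
        parents = sorted(population, key=score, reverse=True)[:keep]
        # Recombination: refill the pool with offspring of random parent pairs.
        children = [recombine(rng.choice(parents), rng.choice(parents), rng)
                    for _ in range(len(population) - len(parents))]
        population = parents + children
        best = max(score(s) for s in population)
        # Plateau test: count consecutive rounds without improvement.
        stale = stale + 1 if best <= history[-1] else 0
        history.append(best)
        if stale >= patience:
            break
    return max(population, key=score), history


# Toy usage: the "trait" is similarity to a hypothetical target string.
TARGET = "ACGTACGT"


def toy_score(seq: str) -> int:
    return sum(x == y for x, y in zip(seq, TARGET))


pool = ["ACGTTTTT", "TTTTACGT", "AAAAAAAA", "CCCCCCCC", "ACGAACGA", "TCGTTCGT"]
best_variant, best_scores = shuffle_and_select(
    pool, toy_score, single_point_crossover, random.Random(0))
```

Because the parents are carried into the next round, the best score never
decreases; the loop stops once it has failed to improve for the chosen number
of consecutive rounds.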

At least five general classes of recombination methods may be applied to
peptide synthetase, iron reductase and/or permease/MFS transporter
polynucleotides. First, the nucleic acids of peptide synthetase, iron reductase
and/or permease/MFS transporter polynucleotides can be recombined in vitro by
any of a variety of techniques, including DNAse digestion of polynucleotides
followed by ligation and/or PCR reassembly of the polynucleotides. Second,
polynucleotides can be recursively recombined in vivo, for example by allowing
recombination to occur between an introduced peptide synthetase, iron reductase
and/or permease/MFS transporter polynucleotide and homologous sequences in a
cell. Third, whole cell genome recombination methods can be used in which
whole genomes of cells are recombined, optionally including spiking the
genomic (nuclear and/or plastid) recombination mixtures with the peptide
synthetase, iron reductase and/or permease/MFS transporter sequences of
interest. Fourth, synthetic recombination methods can be used, in which
oligonucleotides corresponding to different homologs of the peptide synthetase,
iron reductase and/or permease/MFS transporter sequence are synthesized and
reassembled in PCR or ligation reactions which also include oligonucleotides
which correspond to more than one allelic variant, thereby generating new
recombined polynucleotides. Fifth, in silico methods of recombination can be
carried out in which genetic algorithms are used in a computer to recombine
sequence strings which correspond to homologs of the peptide synthetase
sequences of interest. The resulting recombined sequence strings are optionally
converted into nucleic acids by synthesis of nucleic acids which correspond to
the recombined sequences. Such synthesis could proceed by oligonucleotide
synthesis and gene reassembly techniques. Any of the preceding general
recombination formats can be practiced reiteratively to generate a more diverse
set of recombinant nucleic acids.
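
As a rough illustration of the fifth, in silico format, the Python sketch below
recombines sequence strings from several homologs by choosing crossover points
at random and copying each resulting block from a randomly chosen homolog. It
assumes the homologs have already been aligned to equal length; the function
name and parameters are hypothetical, and this is one simple recombination
operator rather than a complete genetic algorithm.

```python
import random


def recombine_homologs(homologs, n_crossovers, rng):
    """Build one chimeric string from equal-length, pre-aligned homologs.

    Crossover points split the alignment into blocks; each block is copied
    from one homolog picked at random.
    """
    length = len(homologs[0])
    if any(len(h) != length for h in homologs):
        raise ValueError("homologs must be aligned to the same length")
    if not 0 <= n_crossovers <= length - 1:
        raise ValueError("too many crossover points for this length")
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    blocks = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(homologs)  # parent supplying this block
        blocks.append(donor[start:end])
    return "".join(blocks)


# Toy usage with three artificial "homologs".
parents = ["AAAAAAAA", "CCCCCCCC", "GGGGGGGG"]
library = [recombine_homologs(parents, 3, random.Random(seed))
           for seed in range(5)]
```

Each string in the resulting library could then be scored computationally or,
as the text notes, converted into a nucleic acid by oligonucleotide synthesis
and gene reassembly.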

The ever-increasing quantity and quality of data being accumulated, not
only about gene sequence, structure and function, but also about gene expression
patterns and protein interactions on genomic scales, makes it no longer feasible
to deal with genetic data on an item-by-item basis but, instead, necessary to
create new ways of discovering biological information by in silico data mining.
"Data mining", as used herein, refers to exploration and analysis of large
quantities of data, by automatic and semi-automatic means, in order to discover
meaningful patterns and rules. Data mining is applied to molecular sequence and
structure data, gene expression and other high-throughput data, and to existing
knowledge in the scientific literature, including making meaningful connections
between different forms of knowledge and data.

A variety of data mining tools can be applied using the peptide
synthetase, iron reductase and/or permease/MFS transporter sequences of the
present invention. A method appropriate for use in sequence databases which
contain long stretches of data, known as long-pattern data sets, is that
disclosed in U.S. Patent No. 6,138,117, which uses a look-ahead scheme for
quickly identifying long patterns that is not limited to the initialization
phase, a heuristic item-ordering policy for tightly focusing the search, and a
support-lower-bounding scheme that is also applicable to other algorithms.
Recursive partitioning is useful to elucidate structure-activity relations and
to guide decision-making for high-throughput screening of compounds for their
effects on peptide synthetase polypeptides, for example as described by Hertzog
et al. (J. Pharmacol. Toxicol. Methods 42:207 (1999)) for sequential screening
of G-protein-coupled receptors. The peptide synthetase, iron reductase and/or
permease/MFS transporter sequences of the present invention may be applied to
digital differential display (DDD) to analyze differential expression and create
an electronic expression profile for a variety of physiological conditions.
Peptide synthetase, iron reductase and/or permease/MFS transporter sequence data
can be analyzed to predict protein domains using the BLAST algorithm.
Higher-order correlations among peptide synthetase, iron reductase and/or
permease/MFS transporter proteins may be predicted by using peptide synthetase
protein sequence data to compare sets of sequence-distant sites displaying high
mutual information, which may bespeak important structural or functional
features, a methodology that overcomes the limitations of previous methods
which examined only single-residue features or pairwise interactions (Steeg et
al., Pac. Symp. Biocomput. 1998:573 (1998)).
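
To make the term "mutual information" concrete, the Python sketch below
computes it for a pair of columns in a multiple sequence alignment. This is
only the pairwise quantity, shown as a baseline; it is not the higher-order
method of Steeg et al., and the function name and input format (a list of
equal-length aligned strings) are assumptions made for the example.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length aligned sequences. Frequencies are
    estimated directly from the observed counts, with no pseudocounts.
    """
    n = len(alignment)
    if n == 0:
        raise ValueError("alignment is empty")
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi
```

Two columns that always change together give a high value, whereas columns
that vary independently give a value near zero. With few sequences the
estimate is biased upward, so in practice it would be compared against a
shuffled-column control.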

Peptide synthetase, iron reductase and/or permease/MFS transporter
polypeptide sequences having structures expressed in a computer-readable form
can be evaluated for function using functional site descriptors (FSDs) for a
biomolecule functional site having a specific biological function, as described in
the publication WO 00/11206. FSDs can be used to identify or screen for a
novel function in one or more peptide synthetase, iron reductase and/or
permease/MFS transporter polypeptides, to confirm a previously identified or
suspected function of a protein, to evaluate the effects of sequence shuffling on
protein function, or to provide further information about a specific functional site
in a peptide synthetase, iron reductase and/or permease/MFS transporter
polypeptide.

FSDs are geometric representations of protein functional sites, typically
defining spatial configurations of functional sites by providing a
three-dimensional (3D) representation of a protein functional site. Preferred
functional sites represented by FSDs include a ligand binding domain, an ion or
cofactor binding site, a site or domain for protein-protein interaction, or an
enzymatic active site. An FSD typically comprises a set of geometric constraints
for one or more atoms in each of two or more amino acid residues comprising a
functional site of a protein. Geometric constraints of an FSD may comprise an
atomic position specified by a set of 3D coordinates, an interatomic distance, an
interatomic bond angle, or conformational constraints imposed by residues at a
site or by secondary structure such as a zinc finger, leucine zipper, helix, or a
strand, where these constraints may be expressed either as fixed coordinates or
ranges. Libraries of FSDs can comprise at least two FSDs for at least one of the
biological functions represented by the library.

FSDs are used to probe protein structures to determine if such structures
contain the functional sites described by the corresponding FSDs. Peptide
synthetase, iron reductase and/or permease/MFS transporter polypeptides to be
screened can comprise an unmodified sequence selected from SEQ ID NO:47,
SEQ ID NO:49 or SEQ ID NO:56, or a modified form derived from random or
directed sequence shuffling as previously described. Typically, functional
screening methods comprise applying an FSD to a structure of a peptide
synthetase, iron reductase and/or permease/MFS transporter polypeptide, where
the structure may be determined by x-ray crystallography, nuclear magnetic
resonance, a computer "ab initio" folding program, a homology program, or a
"threading" program, and expressed in a computer-readable form.

The function of a peptide synthetase, iron reductase and/or
permease/MFS transporter polypeptide whose structure is expressed in
computer-readable form can be screened by applying an FSD to the structure of a
peptide synthetase, iron reductase and/or permease/MFS transporter polypeptide
and determining whether the peptide synthetase, iron reductase and/or
permease/MFS transporter polypeptide structure matches, or satisfies, the
constraints of the FSD. Libraries of FSDs can be used to probe for or evaluate
the activity or function associated with the FSD in one or more protein
structures.
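
The idea of testing whether a structure "satisfies the constraints of the FSD"
can be sketched in a few lines. The Python example below is a deliberately
simplified assumption and not the descriptor format of WO 00/11206: a structure
is a dictionary mapping a (residue, atom) label to 3D coordinates, an FSD is a
list of interatomic distance ranges, and only one fixed assignment of residues
is checked rather than a search over all candidate residues. All names and
coordinates are hypothetical.

```python
from math import dist


def satisfies_fsd(structure, distance_constraints):
    """Return True if every distance constraint is met by the structure.

    structure: dict mapping (residue_label, atom_name) -> (x, y, z)
    distance_constraints: iterable of (atom_key_a, atom_key_b, low, high),
        with distances in the same units as the coordinates.
    """
    for atom_a, atom_b, low, high in distance_constraints:
        if atom_a not in structure or atom_b not in structure:
            return False  # a required atom is absent from the structure
        separation = dist(structure[atom_a], structure[atom_b])
        if not low <= separation <= high:
            return False
    return True


# Toy usage with invented coordinates and an invented two-atom descriptor.
toy_structure = {
    ("HIS57", "NE2"): (0.0, 0.0, 0.0),
    ("SER195", "OG"): (3.0, 0.0, 0.0),
}
toy_fsd = [(("HIS57", "NE2"), ("SER195", "OG"), 2.5, 3.5)]
site_present = satisfies_fsd(toy_structure, toy_fsd)
```

A library of descriptors would be applied by repeating this test for each FSD;
angle or secondary-structure constraints, which the text also mentions, are
omitted from the sketch.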

The DNA molecule encoding the CPS1, iron reductase polypeptide and/or
permease/MFS transporter of the present invention can be incorporated in cells
using conventional recombinant DNA technology. Generally, this involves
inserting the DNA molecule into an expression system to which the DNA
molecule is heterologous (i.e., not normally present). The heterologous DNA
molecule is inserted into the expression system or vector in proper sense
orientation and correct reading frame. The vector contains the necessary
elements for the transcription and translation of the inserted protein-coding
sequences. U.S. Patent No. 4,237,224 describes the production of expression
systems in the form of recombinant plasmids using restriction enzyme cleavage
and ligation with DNA ligase. These recombinant plasmids are then introduced
by means of transformation and replicated in unicellular cultures including
prokaryotic organisms and eukaryotic cells grown in culture. Recombinant
genes may also be introduced into viruses, such as vaccinia virus. Recombinant
viruses can be generated by transfection of plasmids into cells infected with
virus.

Suitable vectors include, but are not limited to, the following viral
vectors such as lambda vector system gt11, gt WES.tB, Charon 4, and plasmid
vectors such as pBR322, pBR325, pACYC177, pACYC184, pUC8, pUC9, pUC18,
pUC19, pLG339, pR290, pKC37, pKC101, SV40, pBluescript II SK +/- or KS
+/- (see "Stratagene Cloning Systems" Catalog (1993) from Stratagene, La Jolla,
Calif.), pQE, pIH821, pGEX, pET series (see Studier et al., "Use of T7 RNA
Polymerase to Direct Expression of Cloned Genes," Gene Expression
Technology, vol. 185 (1990)), and any derivatives thereof. Suitable vectors are
continually being developed and identified. Recombinant molecules can be
introduced into cells via transformation, transduction, conjugation, mobilization,
or electroporation. The DNA sequences are cloned into the vector using
standard cloning procedures in the art, as described by Maniatis et al. or
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York (1982 or 1989, respectively).

A variety of host-vector systems may be utilized to express the
protein-encoding sequence(s). Primarily, the vector system must be compatible
with the host cell used. Host-vector systems include but are not limited to the
following: bacteria transformed with bacteriophage DNA, plasmid DNA, or cosmid
DNA; microorganisms such as yeast containing yeast vectors; mammalian cell
systems infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell
systems infected with virus (e.g., baculovirus); and plant cells infected by
bacteria or transformed via particle bombardment (i.e., biolistics). The
expression elements of these vectors vary in their strength and specificities.
Depending upon the host-vector system utilized, any one of a number of suitable
transcription and translation elements can be used. Different genetic signals
and processing events control many levels of gene expression (e.g., DNA
transcription and messenger RNA ("mRNA") translation). Transcription of DNA is
dependent upon the presence of a promoter, which is a DNA sequence that directs
the binding of RNA polymerase and thereby promotes mRNA synthesis. The DNA
sequences of eukaryotic promoters differ from those of prokaryotic promoters.
Furthermore, eukaryotic promoters and accompanying genetic signals may not be
recognized in or may not function in a prokaryotic system, and, further,
prokaryotic promoters are not recognized and do not function in eukaryotic cells.

Similarly, translation of mRNA in prokaryotes depends upon the presence of the
proper prokaryotic signals, which differ from those of eukaryotes. Efficient
translation of mRNA in prokaryotes requires a ribosome binding site called the
Shine-Dalgarno ("SD") sequence on the mRNA. This sequence is a short
nucleotide sequence of mRNA that is located before the start codon, usually
AUG, which encodes the amino-terminal methionine of the protein. The SD
sequences are complementary to the 3'-end of the 16S rRNA (ribosomal RNA)
and probably promote binding of mRNA to ribosomes by duplexing with the
rRNA to allow correct positioning of the ribosome. For a review on maximizing
gene expression, see Roberts and Lauer, Methods in Enzymology 68:473 (1979).
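
By way of illustration, the arrangement of an SD sequence upstream of a start
codon can be searched for in a sequence string. The Python sketch below is a
minimal example under stated assumptions: it looks only for an exact match to
the consensus motif AGGAGG (real SD sequences are often shorter or imperfect),
it measures spacing as the number of bases between the end of the motif and
the AUG, and the default 7-9 base spacing window and the function name are
illustrative choices.

```python
import re


def find_sd_start_pairs(seq, sd_motif="AGGAGG", min_gap=7, max_gap=9):
    """Find start codons preceded by an SD-like motif at a suitable spacing.

    Accepts DNA or RNA; T is converted to U before searching.
    Returns a list of (sd_position, start_codon_position, gap) tuples, where
    gap is the number of bases between the end of the motif and the AUG.
    """
    rna = seq.upper().replace("T", "U")
    motif = sd_motif.upper().replace("T", "U")
    hits = []
    # The lookahead finds every AUG, including overlapping occurrences.
    for match in re.finditer(r"(?=AUG)", rna):
        start = match.start()
        for gap in range(min_gap, max_gap + 1):
            sd_pos = start - gap - len(motif)
            if sd_pos >= 0 and rna[sd_pos:sd_pos + len(motif)] == motif:
                hits.append((sd_pos, start, gap))
    return hits


# Toy usage: an invented leader with the motif seven bases before ATG.
example = "CC" + "AGGAGG" + "A" * 7 + "ATG" + "GCT"
pairs = find_sd_start_pairs(example)
```

A construct lacking any hit within the window would be a candidate for
redesign of the leader region before expression in a prokaryotic host.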

Promoters vary in their "strength" (i.e., their ability to promote
transcription). For the purposes of expressing a cloned gene, it is desirable to
use strong promoters in order to obtain a high level of transcription and, hence,
expression of the gene. Depending upon the host cell system utilized, any one of
a number of suitable promoters may be used. For instance, when cloning in E.
coli, its bacteriophages, or plasmids, promoters such as the phage promoter, lac
promoter, trp promoter, recA promoter, ribosomal RNA promoter, the PR and
PL promoters of coliphage lambda and others, including but not limited to
lacUV5, ompF, bla, lpp, and the like, may be used to direct high levels of
transcription of adjacent DNA segments. Additionally, a hybrid trp-lacUV5
(tac) promoter or other E. coli promoters produced by recombinant DNA or other
synthetic DNA techniques may be used to provide for transcription of the inserted
gene. Bacterial host cell strains and expression vectors may be chosen which
inhibit the action of the promoter unless specifically induced. In certain operons, 
the addition of specific inducers is necessary for efficient transcription of the 
inserted DNA. For example, the lac operon is induced by the addition of lactose 
5 or IPTG (isopropylthiobeta-D-galactoside). A variety of other operons, such as 
trp, pro, etc., are under different controls. Specific initiation signals are also 
required for efficient gene transcription and translation in prokaryotic cells. 
These transcription and translation initiation signals may vary in "strength" as 
measured by the quantity of gene specific messenger RNA and protein 

10 synthesized, respectively. The DNA expression vector, which contains a 

promoter, may also contain any combination of various "strong" transcription 
and/or translation initiation signals. For instance, efficient translation in E. coli 
requires a Shine-Dalgarno ("SD" sequence about 7-9 bases 5' to the initiation 
codon ("ATG") to provide a ribosome binding site. Thus, any SD-ATG 

1 5 combination that can be utilized by host cell ribosomes may be employed. Such 
combinations include but are not limited to the SD-ATG combination from the 
cro gene or the N gene of coliphage lambda, or from the E. coli tryptophan E, D, 
C, B or A genes. Additionally, any SD-ATG combination produced by 
recombinant DNA or other techniques involving incorporation of synthetic 

20 nucleotides may be used. The present invention also relates to anti-sense nucleic 
acid for essential cell proteins, such as replication proteins which serve to tender 
host cells incapable of further cell growth and division. Anti-sense regulation 
has been described by Rosenberg et al., Nature , 313:703 (1985); Preiss et al., 
Nature, 313:27 (1985); Melton, Proc. Natl. Acad. Sci. USA . 82:144 (1985); Izaut 

25 et al., Science . 229:342 (1985); Kim et al., Cell, 42:129 (1985); Bestka et al., 
Proc Nati. Acad. Sci. USA . 81:7525 (1984); Coleman et al., Cell, 37:429 
(1984); and McQany et al., Proc. Natl. Acad. Sci. USA . 83:399 (1986), which 
are hereby incorporated by reference. 
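
The SD-ATG spacing described above can be illustrated with a short sketch. It assumes the canonical E. coli consensus AGGAGG as the SD motif and counts the spacing as the number of bases between the end of the motif and the A of the ATG; both choices, and the function name find_sd_atg, are illustrative assumptions rather than part of the disclosure.

```python
import re

# Assumption: the canonical Shine-Dalgarno consensus is used as the motif.
SD_MOTIF = "AGGAGG"


def find_sd_atg(seq, min_gap=7, max_gap=9):
    """Return (sd_start, atg_start, gap) for each ATG preceded by the SD motif.

    gap is the number of bases between the last base of the SD motif and
    the A of the ATG (an assumed reading of "about 7-9 bases 5' to ATG").
    Positions are 0-based.
    """
    seq = seq.upper()
    hits = []
    for match in re.finditer("ATG", seq):
        atg_start = match.start()
        for gap in range(min_gap, max_gap + 1):
            sd_end = atg_start - gap
            sd_start = sd_end - len(SD_MOTIF)
            if sd_start >= 0 and seq[sd_start:sd_end] == SD_MOTIF:
                hits.append((sd_start, atg_start, gap))
    return hits


# Hypothetical test sequence: SD motif, a 7-base spacer, then ATG.
example = "TT" + "AGGAGG" + "ACAGCTA" + "ATG" + "AAA"
print(find_sd_atg(example))
```

Real ribosome binding sites tolerate mismatches to the consensus, so an exact-match scan of this kind will miss weaker but functional SD sequences.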

Once the isolated DNA molecules encoding the CPS1 polypeptide or iron
reductase have been cloned into an expression system, they are ready to be
incorporated into a host cell. Such incorporation can be carried out by the
various forms of transformation noted above, depending upon the vector/host
cell system. Suitable host cells include, but are not limited to, bacteria, virus,
yeast, mammalian cells, insect, plant, and the like. In the present invention, the
host cells are from plants such as corn, oat, grass, weeds, bamboo, and
sugarcane. In this aspect of the present invention, large numbers of compounds
can be screened for their activity as inhibitors of CPS1 protein, iron reductase or
permease/MFS transporter by a high throughput screening assay as described in
U.S. Patent No. 5,767,946. Generally, a library of compounds is assayed for
inhibition of an enzyme-catalyzed reaction and the amounts of fluorescence
bound to individual suspendable solid supports are measured to determine the
degree of inhibition. For example, the amount of fluorescence bound to a
microbead in the presence of inhibitory compounds is greater than for
non-inhibitory compounds. The amounts of fluorescence bound to individual
beads are determined by confocal microscopy. Using this type of assay,
inhibition can be determined, e.g., of a peptide synthetase such as CPS1. For
CPS1 the substrate can be amino acids (or hydroxy acids), linked at one end to
the microbead and at the other end to a fluorescent label. The enzyme inhibitors
can be utilized to impart fungal resistance to a variety of vertebrate organisms.

Another aspect of the present invention involves using one or more of the
above DNA molecules encoding the CPS1 polypeptide or a gene encoding an
enzyme that degrades the CPS1 product to transform organisms to impart fungal
resistance to the organism. This concept of pathogen-derived resistance,
according to U.S. Patent No. 5,840,481, is that host resistance to a particular
parasite can effectively be engineered by introducing a gene, gene fragment, or
modified gene or gene fragment of the pathogen into the host. This approach is
based on the fact that in any parasite-host interaction, there are certain
parasite-encoded cellular functions (activities) that are essential to the parasite
but not to the host, and that when one of the essential functions of the parasite
such as survival or reproduction is disrupted, the parasitic process will be stopped.
"Disruption" refers to any change that diminishes the survival, reproduction, or
infectivity of the parasite. Such essential functions, which are under the
control of the parasite's genes, can be disrupted by the presence of a
corresponding gene product in the host which is (1) dysfunctional, (2) in excess,
or (3) appears in the wrong context or at the wrong developmental stage in the
parasite's life cycle. If such faulty signals are designed specifically for parasitic
cell functions, they will have little effect on the host. Therefore, the procedure
for making organisms, for example, resistant to infection by one or more fungi
involves isolating DNA coding for a gene such as CPS1 of a fungus, operably
linking the DNA within an expression vector, and transforming a cell or tissue
with the expression vector. When the transformed cells or tissue are in the
presence of a fungus such as Cochliobolus heterostrophus, the CPS1 DNA is
expressed as a gene product and the CPS1 protein disrupts the essential activity
of the fungus.

Dosages, Formulations and Routes of Administration of the Agents of the 
Invention 

The therapeutic agents identified by the methods of the invention may be
administered at dosages of at least about 0.01 to about 100 mg/kg, more
preferably about 0.1 to about 50 mg/kg, and even more preferably about 0.1 to
about 30 mg/kg, of body weight, although other dosages may provide beneficial
results. The amount administered will vary depending on various factors
including, but not limited to, the agent chosen, the disease, whether prevention or
treatment is to be achieved, and if the agent is modified for bioavailability and in
vivo stability.
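
As a purely arithmetic illustration of the per-kilogram ranges stated above, the following sketch converts each range into an absolute amount for a given body weight. The tier labels and the function name dose_window_mg are hypothetical, and the sketch is not dosing guidance.

```python
# Ranges in mg per kg of body weight, as stated in the paragraph above.
# The tier labels are hypothetical names chosen for this sketch.
DOSE_RANGES_MG_PER_KG = {
    "broad": (0.01, 100.0),
    "preferred": (0.1, 50.0),
    "more_preferred": (0.1, 30.0),
}


def dose_window_mg(body_weight_kg, tier="broad"):
    """Return the (low, high) amount in mg for the chosen range and body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG[tier]
    return low * body_weight_kg, high * body_weight_kg


# Hypothetical 70 kg recipient.
low_mg, high_mg = dose_window_mg(70.0, "more_preferred")
print(f"{low_mg:.1f} mg to {high_mg:.1f} mg")
```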

Administration of a sense or antisense nucleic acid molecule encoding a
therapeutic agent may be accomplished through the introduction of cells
transformed with an expression cassette comprising the nucleic acid molecule
(see, for example, WO 93/02556) or the administration of the nucleic acid
molecule (see, for example, Felgner et al., U.S. Patent No. 5,580,859; Pardoll et
al., Immunity, 3:165 (1995); Stevenson et al., Immunol. Rev., 145:211 (1995);
Molling, J. Mol. Med., 75:242 (1997); Donnelly et al., Ann. N.Y. Acad. Sci.,
772:40 (1995); Yang et al., Mol. Med. Today, 2:476 (1996); Abdallah et al.,
Biol. Cell, 85:1 (1995)). Pharmaceutical formulations, dosages and routes of
administration for nucleic acids are generally disclosed, for example, in Felgner
et al., supra.

The therapeutic agents of the invention are amenable to chronic use for
prophylactic purposes, preferably by systemic administration.

Administration of the therapeutic agents in accordance with the present
invention may be continuous or intermittent, depending, for example, upon the
recipient's physiological condition, whether the purpose of the administration is
therapeutic or prophylactic, and other factors known to skilled practitioners. The
administration of the agents of the invention may be essentially continuous over
a preselected period of time or may be in a series of spaced doses. Both local
and systemic administration are contemplated.

One or more suitable unit dosage forms comprising the therapeutic agents
of the invention, which, as discussed below, may optionally be formulated for
sustained release, can be administered by a variety of routes including oral or
parenteral routes, including rectal, buccal, vaginal, sublingual, transdermal,
subcutaneous, intravenous, intramuscular, intraperitoneal, intrathoracic,
intrapulmonary and intranasal routes. The formulations may, where appropriate,
be conveniently presented in discrete unit dosage forms and may be prepared by
any of the methods well known to pharmacy. Such methods may include the
step of bringing into association the therapeutic agent with liquid carriers, solid
matrices, semi-solid carriers, finely divided solid carriers or combinations
thereof, and then, if necessary, introducing or shaping the product into the
desired delivery system.

When the therapeutic agents of the invention are prepared for oral
administration, they are preferably combined with a pharmaceutically acceptable
carrier, diluent or excipient to form a pharmaceutical formulation, or unit dosage
form. The total active ingredients in such formulations comprise from 0.1 to
99.9% by weight of the formulation. By "pharmaceutically acceptable" it is
meant the carrier, diluent, excipient, and/or salt must be compatible with the
other ingredients of the formulation, and not deleterious to the recipient thereof.
The active ingredient for oral administration may be present as a powder or as
granules; as a solution, a suspension or an emulsion; or in a chewable base such
as a synthetic resin for ingestion of the active ingredients from a chewing gum.
The active ingredient may also be presented as a bolus, electuary or paste.

Formulations suitable for vaginal administration may be presented as
pessaries, tampons, creams, gels, pastes, douches, lubricants, foams or sprays
containing, in addition to the active ingredient, such carriers as are known in the
art to be appropriate. Formulations suitable for rectal administration may be
presented as suppositories.

Pharmaceutical formulations containing the therapeutic agents of the
invention can be prepared by procedures known in the art using well-known and
readily available ingredients. For example, the agent can be formulated with
common excipients, diluents, or carriers, and formed into tablets, capsules,
suspensions, powders, and the like. Examples of excipients, diluents, and
carriers that are suitable for such formulations include the following: fillers and
extenders such as starch, sugars, mannitol, and silicic derivatives; binding agents
such as carboxymethyl cellulose, HPMC and other cellulose derivatives,
alginates, gelatin, and polyvinylpyrrolidone; moisturizing agents such as
glycerol; disintegrating agents such as calcium carbonate and sodium
bicarbonate; agents for retarding dissolution such as paraffin; resorption
accelerators such as quaternary ammonium compounds; surface active agents
such as cetyl alcohol and glycerol monostearate; adsorptive carriers such as kaolin
and bentonite; and lubricants such as talc, calcium and magnesium stearate, and
solid polyethylene glycols.

For example, tablets or caplets containing the agents of the invention can
include buffering agents such as calcium carbonate, magnesium oxide and
magnesium carbonate. Caplets and tablets can also include inactive ingredients
such as cellulose, pregelatinized starch, silicon dioxide, hydroxy propyl methyl
cellulose, magnesium stearate, microcrystalline cellulose, starch, talc, titanium
dioxide, benzoic acid, citric acid, corn starch, mineral oil, polypropylene glycol,
sodium phosphate, and zinc stearate, and the like. Hard or soft gelatin capsules
containing an agent of the invention can contain inactive ingredients such as
gelatin, microcrystalline cellulose, sodium lauryl sulfate, starch, talc, and
titanium dioxide, and the like, as well as liquid vehicles such as polyethylene
glycols (PEGs) and vegetable oil. Moreover, enteric coated caplets or tablets of
an agent of the invention are designed to resist disintegration in the stomach and
dissolve in the more neutral to alkaline environment of the duodenum.

The therapeutic agents of the invention can also be formulated as elixirs 
or solutions for convenient oral administration or as solutions appropriate for 
parenteral administration, for instance by intramuscular, subcutaneous or 
intravenous routes. 

The pharmaceutical formulations of the therapeutic agents of the
invention can also take the form of an aqueous or anhydrous solution or
dispersion, or alternatively the form of an emulsion or suspension.

Thus, the therapeutic agent may be formulated for parenteral
administration (e.g., by injection, for example, bolus injection or continuous
infusion) and may be presented in unit dose form in ampules, pre-filled syringes,
small volume infusion containers or in multi-dose containers with an added
preservative. The active ingredients may take such forms as suspensions,
solutions, or emulsions in oily or aqueous vehicles, and may contain formulatory
agents such as suspending, stabilizing and/or dispersing agents. Alternatively,
the active ingredients may be in powder form, obtained by aseptic isolation of
sterile solid or by lyophilization from solution, for constitution with a suitable
vehicle, e.g., sterile, pyrogen-free water, before use.

These formulations can contain pharmaceutically acceptable vehicles and
adjuvants which are well known in the prior art. It is possible, for example, to
prepare solutions using one or more organic solvent(s) that is/are acceptable
from the physiological standpoint, chosen, in addition to water, from solvents
such as acetone, ethanol, isopropyl alcohol, glycol ethers such as the products
sold under the name "Dowanol", polyglycols and polyethylene glycols, C1-C4
alkyl esters of short-chain acids, preferably ethyl or isopropyl lactate, fatty acid
triglycerides such as the products marketed under the name "Miglyol", isopropyl
myristate, animal, mineral and vegetable oils and polysiloxanes.

The compositions according to the invention can also contain thickening
agents such as cellulose and/or cellulose derivatives. They can also contain
gums such as xanthan, guar or carbo gum or gum arabic, or alternatively
polyethylene glycols, bentones and montmorillonites, and the like.

It is possible to add, if necessary, an adjuvant chosen from antioxidants,
surfactants, other preservatives, film-forming, keratolytic or comedolytic agents,
perfumes and colorings. Also, other active ingredients may be added, whether
for the conditions described or some other condition.

For example, among antioxidants, t-butylhydroquinone, butylated
hydroxyanisole, butylated hydroxytoluene and α-tocopherol and its derivatives
may be mentioned. The galenical forms chiefly conditioned for topical
application take the form of creams, milks, gels, dispersion or microemulsions,
lotions thickened to a greater or lesser extent, impregnated pads, ointments or
sticks, or alternatively the form of aerosol formulations in spray or foam form or
alternatively in the form of a cake of soap.

Additionally, the agents are well suited to formulation as sustained
release dosage forms and the like. The formulations can be so constituted that
they release the active ingredient only or preferably in a particular part of the
intestinal or respiratory tract, possibly over a period of time. The coatings,
envelopes, and protective matrices may be made, for example, from polymeric
substances, such as polylactide-glycolates, liposomes, microemulsions,
microparticles, nanoparticles, or waxes. These coatings, envelopes, and
protective matrices are useful to coat indwelling devices, e.g., stents, catheters,
peritoneal dialysis tubing, and the like.

The therapeutic agents of the invention can be delivered via patches for
transdermal administration. See U.S. Patent No. 5,560,922 for examples of
patches suitable for transdermal delivery of a therapeutic agent. Patches for
transdermal delivery can comprise a backing layer and a polymer matrix which
has dispersed or dissolved therein a therapeutic agent, along with one or more
skin permeation enhancers. The backing layer can be made of any suitable
material which is impermeable to the therapeutic agent. The backing layer
serves as a protective cover for the matrix layer and also provides a support
function. The backing can be formed so that it is essentially the same size layer
as the polymer matrix or it can be of larger dimension so that it can extend
beyond the side of the polymer matrix or overlay the side or sides of the polymer
matrix and then can extend outwardly in a manner that the surface of the
extension of the backing layer can be the base for an adhesive means.
Alternatively, the polymer matrix can contain, or be formulated of, an adhesive
polymer, such as polyacrylate or acrylate/vinyl acetate copolymer. For long-term
applications it might be desirable to use microporous and/or breathable backing
laminates, so hydration or maceration of the skin can be minimized.

Examples of materials suitable for making the backing layer are films of
high and low density polyethylene, polypropylene, polyurethane,
polyvinylchloride, polyesters such as poly(ethylene phthalate), metal foils, metal
foil laminates of such suitable polymer films, and the like. Preferably, the
materials used for the backing layer are laminates of such polymer films with a
metal foil such as aluminum foil. In such laminates, a polymer film of the
laminate will usually be in contact with the adhesive polymer matrix.

The backing layer can be any appropriate thickness which will provide
the desired protective and support functions. A suitable thickness will be from
about 10 to about 200 microns.

Generally, those polymers used to form the biologically acceptable
adhesive polymer layer are those capable of forming shaped bodies, thin walls or
coatings through which therapeutic agents can pass at a controlled rate. Suitable
polymers are biologically and pharmaceutically compatible, nonallergenic and
insoluble in and compatible with body fluids or tissues with which the device is
contacted. The use of soluble polymers is to be avoided since dissolution or
erosion of the matrix by skin moisture would affect the release rate of the
therapeutic agents as well as the capability of the dosage unit to remain in place
for convenience of removal.

Exemplary materials for fabricating the adhesive polymer layer include
polyethylene, polypropylene, polyurethane, ethylene/propylene copolymers,
ethylene/ethylacrylate copolymers, ethylene/vinyl acetate copolymers, silicone
elastomers, especially the medical-grade polydimethylsiloxanes, neoprene
rubber, polyisobutylene, polyacrylates, chlorinated polyethylene, polyvinyl
chloride, vinyl chloride-vinyl acetate copolymer, crosslinked polymethacrylate
polymers (hydrogel), polyvinylidene chloride, poly(ethylene terephthalate), butyl
rubber, epichlorohydrin rubbers, ethylene-vinyl alcohol copolymers,
ethylene-vinyloxyethanol copolymers; silicone copolymers, for example,
polysiloxane-polycarbonate copolymers, polysiloxane-polyethylene oxide
copolymers, polysiloxane-polymethacrylate copolymers, polysiloxane-alkylene
copolymers (e.g., polysiloxane-ethylene copolymers), polysiloxane-alkylenesilane
copolymers (e.g., polysiloxane-ethylenesilane copolymers), and the like;
cellulose polymers, for example methyl or ethyl cellulose, hydroxy propyl methyl
cellulose, and cellulose esters; polycarbonates; polytetrafluoroethylene; and the
like.

Preferably, a biologically acceptable adhesive polymer matrix should be
selected from polymers with glass transition temperatures below room
temperature. The polymer may, but need not necessarily, have a degree of
crystallinity at room temperature. Cross-linking monomeric units or sites can be
incorporated into such polymers. For example, cross-linking monomers can be
incorporated into polyacrylate polymers, which provide sites for cross-linking the
matrix after dispersing the therapeutic agent into the polymer. Known
cross-linking monomers for polyacrylate polymers include polymethacrylic esters
of polyols such as butylene diacrylate and dimethacrylate, trimethylol propane
trimethacrylate and the like. Other monomers which provide such sites include
allyl acrylate, allyl methacrylate, diallyl maleate and the like.

Preferably, a plasticizer and/or humectant is dispersed within the
adhesive polymer matrix. Water-soluble polyols are generally suitable for this
purpose. Incorporation of a humectant in the formulation allows the dosage unit
to absorb moisture on the surface of skin which in turn helps to reduce skin
irritation and to prevent the adhesive polymer layer of the delivery system from
failing.

Therapeutic agents released from a transdermal delivery system must be
capable of penetrating each layer of skin. In order to increase the rate of
permeation of a therapeutic agent, a transdermal drug delivery system must be
able in particular to increase the permeability of the outermost layer of skin, the
stratum corneum, which provides the most resistance to the penetration of
molecules. The fabrication of patches for transdermal delivery of therapeutic
agents is well known to the art.

For administration to the upper (nasal) or lower respiratory tract by
inhalation, the therapeutic agents of the invention are conveniently delivered
from an insufflator, nebulizer or a pressurized pack or other convenient means of
delivering an aerosol spray. Pressurized packs may comprise a suitable
propellant such as dichlorodifluoromethane, trichlorofluoromethane,
dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a
pressurized aerosol, the dosage unit may be determined by providing a valve to
deliver a metered amount.

Alternatively, for administration by inhalation or insufflation, the
composition may take the form of a dry powder, for example, a powder mix of
the therapeutic agent and a suitable powder base such as lactose or starch. The
powder composition may be presented in unit dosage form in, for example,
capsules or cartridges, or, e.g., gelatine or blister packs from which the powder
may be administered with the aid of an inhalator, insufflator or a metered-dose
inhaler.

For intra-nasal administration, the therapeutic agent may be administered
via nose drops or a liquid spray, such as via a plastic bottle atomizer or
metered-dose inhaler. Typical of atomizers are the Mistometer (Winthrop) and the
Medihaler (Riker).

The local delivery of the therapeutic agents of the invention can also be
by a variety of techniques which administer the agent at or near the site of
disease. Examples of site-specific or targeted local delivery techniques are not
intended to be limiting but to be illustrative of the techniques available.
Examples include local delivery catheters, such as an infusion or indwelling
catheter, e.g., a needle infusion catheter, shunts and stents or other implantable
devices, site specific carriers, direct injection, or direct applications.

For topical administration, the therapeutic agents may be formulated as is
known in the art for direct application to a target area. Conventional forms for
this purpose include wound dressings, coated bandages or other polymer
coverings, ointments, creams, lotions, pastes, jellies, sprays, and aerosols, as well
as in toothpaste and mouthwash, or by other suitable forms, e.g., via a coated
condom. Ointments and creams may, for example, be formulated with an
aqueous or oily base with the addition of suitable thickening and/or gelling
agents. Lotions may be formulated with an aqueous or oily base and will in
general also contain one or more emulsifying agents, stabilizing agents,
dispersing agents, suspending agents, thickening agents, or coloring agents. The
active ingredients can also be delivered via iontophoresis, e.g., as disclosed in
U.S. Patent Nos. 4,140,122; 4,383,529; or 4,051,842. The percent by weight of a
therapeutic agent of the invention present in a topical formulation will depend on
various factors, but generally will be from 0.01% to 95% of the total weight of
the formulation, and typically 0.1-25% by weight.

When desired, the above-described formulations can be adapted to give
sustained release of the active ingredient employed, e.g., by combination with
certain hydrophilic polymer matrices, e.g., comprising natural gels, synthetic
polymer gels or mixtures thereof.

Drops, such as eye drops or nose drops, may be formulated with an
aqueous or non-aqueous base also comprising one or more dispersing agents,
solubilizing agents or suspending agents. Liquid sprays are conveniently
delivered from pressurized packs. Drops can be delivered via a simple eye
dropper-capped bottle, or via a plastic bottle adapted to deliver liquid contents
dropwise, via a specially shaped closure.

The therapeutic agent may further be formulated for topical
administration in the mouth or throat. For example, the active ingredients may
be formulated as a lozenge further comprising a flavored base, usually sucrose
and acacia or tragacanth; pastilles comprising the composition in an inert base
such as gelatin and glycerin or sucrose and acacia; mouthwashes comprising the
composition of the present invention in a suitable liquid carrier; and pastes and
gels, e.g., toothpastes or gels, comprising the composition of the invention.

The formulations and compositions described herein may also contain
other ingredients such as antimicrobial agents or preservatives. Furthermore,
the active ingredients may also be used in combination with other therapeutic
agents, for example, oral contraceptives, bronchodilators, anti-viral agents,
steroids and the like.

The invention will be further described by the following non-limiting
examples.

Example 1

Mutant Preparation and Characterization 
Materials and Methods 

Strains, Media, Crosses and Transformation. C4 (Tox1+; MAT-2) and C5
(Tox1-; MAT-1) are members of near-isogenic C. heterostrophus strains (Leach
et al., 1982, supra). R.C4.2696 (Tox+; MAT-2; hygBR) is a C4-derived mutant
generated using the REMI mutagenesis procedure (Lu et al., Proc. Natl. Acad.
Sci. USA, 91:12649 (1994)). Strains 1301R33 (Tox-; MAT-2; hygBR), 1301R45
(Tox-; MAT-1; hygBR) and 1301R26 (Tox+; MAT-2; hygBR) are progeny of the
cross C5 X R.C4.2696. Culture media, including CM (complete medium), CMX
(complete medium with xylose instead of glucose), CMNS (CM with salts
omitted), and MM (minimal medium) have been described, as have mating
procedures (Leach et al., 1982, supra; Turgeon et al., Mol. Gen. Genet., 201:450
(1985)). All strains were grown at 24 °C under warm white light or black
light (F40/350BL) (Sylvania Inc., Danvers, MA). Ascospore germination was
done at 32 °C in the dark for 3 days. REMI transformants were purified by
transferring the transformants from the original REMI plates to fresh CMNS
medium containing hygromycin B (Calbiochem) at 80 µg/ml. For conidiation,
stable transformants were transferred to CMX containing the same drug but at a
higher concentration (120 µg/ml) to compensate for reduced drug activity due to
the inhibition by the salts in the medium. Single conidia were picked up under a
dissecting microscope and grown on CMNS hygromycin B plates; stable
colonies were then transferred to individual CMX/hygromycin plates. All
purified transformants were stored at -70 °C in CM liquid medium containing
25% glycerol in 96-well microtiter dishes.
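
The two hygromycin B working concentrations given above (80 µg/ml in CMNS, 120 µg/ml in CMX) follow the usual C1 x V1 = C2 x V2 dilution relation. The sketch below is illustrative only: the 50 mg/ml stock concentration and the 500 ml batch volume are assumptions, not values from this example, and stock_volume_ul is a hypothetical helper name.

```python
def stock_volume_ul(final_ug_per_ml, final_volume_ml, stock_mg_per_ml):
    """Volume of stock solution (in µl) needed for a target final concentration.

    Uses C1 * V1 = C2 * V2, with the stock concentration converted to µg/ml.
    """
    if stock_mg_per_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("concentrations and volumes must be positive")
    stock_ug_per_ml = stock_mg_per_ml * 1000.0
    stock_volume_ml = final_ug_per_ml * final_volume_ml / stock_ug_per_ml
    return stock_volume_ml * 1000.0


# Assumed values: 50 mg/ml stock, 500 ml of medium per batch.
for medium, target in (("CMNS", 80.0), ("CMX", 120.0)):
    volume = stock_volume_ul(target, 500.0, 50.0)
    print(f"{medium}: add {volume:.0f} µl of stock per 500 ml")
```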

Bioassays. Fungal strains were grown on CMX plates (100 X 15 mm) for
7-10 days at 24 °C under light for maximum conidiation. To verify normal
T-toxin production by a race T isolate, 1.0 ml of T-toxin-sensitive E. coli
(DH5α) cells was evenly spread on LB medium containing ampicillin (100
µg/ml) and the plates were allowed to air dry for 30 minutes in a laminar hood.
Agar plugs bearing fungal mycelia were inoculated (upside down) onto the E.
coli cell lawn and the plates were incubated at 32 °C. Wild type race T and race
O were used as controls for each assay plate. T-toxin-producing strains of the
fungus will inhibit growth of the E. coli cells and produce halos. Tox mutants
can be distinguished from wild type by failure to produce a halo (tight) or by
production of halos smaller (leaky) or larger than wild type (overproducing). All
Tox mutants were transferred to Fries medium (Pringle et al., Phytopathology,
47:369 (1957)), which optimizes toxin production, and retested.
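
The halo phenotypes named above (tight, leaky, overproducing) can be expressed as a simple comparison against the wild type control on the same plate. In the sketch below, the use of halo diameter in millimetres, the 15% tolerance band, and the function name classify_halo are all assumptions made for illustration; the text does not state a numerical cutoff.

```python
def classify_halo(mutant_mm, wild_type_mm, tolerance=0.15):
    """Classify a halo relative to the wild type control on the same plate.

    tolerance is an assumed fractional band around the wild type diameter
    within which a halo is treated as indistinguishable from wild type.
    """
    if wild_type_mm <= 0:
        raise ValueError("wild type control must produce a halo")
    if mutant_mm <= 0:
        return "tight"          # no halo at all
    ratio = mutant_mm / wild_type_mm
    if ratio < 1.0 - tolerance:
        return "leaky"          # halo smaller than wild type
    if ratio > 1.0 + tolerance:
        return "overproducing"  # halo larger than wild type
    return "wild type"


# Hypothetical diameters in mm, with a 20 mm wild type control.
for diameter in (0.0, 10.0, 20.0, 30.0):
    print(diameter, classify_halo(diameter, 20.0))
```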

T-cytoplasm corn plants (inbred W64A) are used to verify the Tox
mutants identified from the E. coli assay using the procedure described below.
Mutants defective in T-toxin production fail to produce typical race T symptoms
on T-corn. Pathogenicity phenotype on N-cytoplasm corn and virulence of Tox
strains to T-cytoplasm corn were determined by a plant assay in which about 3,000
transformants generated using the REMI mutagenesis procedure (Lu et al., Proc.
Natl. Acad. Sci. USA, 91:12649 (1994)) were screened for mutants defective in
the ability to cause disease on corn plants. Two-week-old N-cytoplasm corn plants
(inbred W64A) grown in the greenhouse (5-6 plants in one 4" X 6" pot) were
inoculated with 5 ml conidial suspensions (10^5 conidia/ml) using a pressurized
Preval Spray Gun Power Unit thin layer chromatography sprayer (Alltech
Associates, Deerfield, IL), incubated in the mist chamber for 24 hours (23 °C)
and then taken to the growth chamber (23 °C, 80% humidity, 14 hours of light).
The mutant phenotypes were determined by occurrence of apparent variations in
disease symptom development, mainly by lesion size comparison. Mutants
producing lesions smaller than wild type were retested, and lengths of typical
lesions from each mutant were compared with wild type 7 days after inoculation;
measurements were taken for statistical evaluation.
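
The text does not say which statistical test was applied to the lesion lengths. As one possible approach, the sketch below computes Welch's unequal-variance t statistic and its approximate degrees of freedom for two sets of measurements; the choice of test, the function name welch_t, and the lesion lengths shown are all assumptions for illustration, not data from this example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two measurements")
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical lesion lengths in mm, 7 days after inoculation.
mutant_lesions = [1.0, 2.0, 3.0, 4.0, 5.0]
wild_type_lesions = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(mutant_lesions, wild_type_lesions)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A negative t value indicates that the first sample (here the mutant) has the smaller mean lesion length; converting t and df to a p-value requires a t distribution table or a statistics library.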

DNA manipulations and sequencing. Genomic and plasmid DNA
preparation, restriction enzyme digestions, gel electrophoresis and gel blot
analysis were done using standard protocols (Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor, New York: Cold
Spring Harbor Laboratory Press (1989)). DNA was sequenced at the Cornell
DNA Sequencing Facility using TaqCycle automated sequencing with
DyeDeoxy terminators (Applied Biosystems, Foster City, CA). pUCATPH was
used for subcloning (Table 1). Primers used for sequencing (Table 2) were
designed using Primer Select (DNASTAR Inc., LaserGene System) and
synthesized by the Cornell Oligonucleotide Synthesis Facility. Sequencing of
each plasmid clone was initiated with vector-specific primers or primers
designed to previously determined sequences. Sequences obtained were
analyzed using the same system and nucleotide or protein database searches were
performed with the BLAST program (Altschul et al., J. Mol. Biol., 215:403
(1990)).



Table 1. Transformation vectors and clones used.

Plasmid     Length     Characteristics (See U.S. application Serial Nos.
            (kb) a     60/252,649 and 60/252,732)

pUCATPH     5.1        See Figure 14 in U.S. application Serial No.
                       60/252,649.

pUCATPHN    4.6        Cloning vector, same as pUCATPH but lacking a 420 bp
                       NarI fragment containing the HindIII site

p214B7      9.2        A clone containing pUCATPH recovered from the tagged
                       site in mutant R.C4.2696 by religation of
                       BglII-digested genomic DNA

p214M1      6.3        As above but with MscI-digested genomic DNA

p214S1      9.3        As above but with SacI-digested genomic DNA

p214S1N     3.3        NarI fragment derived from 214S1 containing a 0.8 kb
                       NarI-SacI fragment of genomic DNA ligated to pUC18

p214SNP     SA         Vector for targeted integration constructed by
                       ligating HindIII-digested pUCATPH into the HindIII
                       site of p214S1N

p118BSP     7.3        Vector for targeted integration constructed by
                       ligation of a 2.2 kb SacI fragment of p118BC4 into
                       the SacI site of pUCATPH

p118BCS     5.4        Vector for targeted integration constructed by
                       ligation of a 0.8 kb SspI fragment of p118BC4 into
                       the SspI site of pUCATPHN

p118B14     10.4       A clone recovered from the p214SNP integration site
                       in transformant #118 by ligation of a BglII-digested
                       genomic DNA fragment containing the entire vector

p118BC4     6.1        A clone recovered from same site as above but by
                       ligation of a BclI-digested genomic DNA fragment
                       containing part of vector (214SNP) sequence

p9P2        7.3        A clone recovered from the p118BSP integration site
                       in transformant #9 by ligation of a PM-digested
                       genomic DNA fragment containing pUC18

p12H6       8ti        A clone recovered from the p118BCS integration site
                       in transformant #12 by ligation of a HindIII-digested
                       genomic DNA fragment containing the entire vector.

a An underlined kb number indicates that the plasmid carries genomic DNA
sequences.



Table 2. Primers used for sequencing recovered genomic DNA flanking the
REMI insertion site at the R.C4.2696 mutation.

Name(a)      Position(b)   Sequence(c)      Plasmid(d)   Origin(e)
-----------  ------------  ---------------  -----------  -----------
M13RMT                     SEQ ID NO:4      A            pUC18
1. RP1b      775           SEQ ID NO:5      A            214B7TrpC
2. RP2       604           SEQ ID NO:6      A            214B7RP1b
3. RP3       119           SEQ ID NO:7      A            214B7RP2
4. RP4       -232          SEQ ID NO:8      A            214B7RP3
5. RP5       -812          SEQ ID NO:9      A            214B7RP4
6. RP5b      -1215         SEQ ID NO:10     A            214B7RP4
7. RP6       -1392         SEQ ID NO:11     A            214B7RP5
8. RP7       -1839         SEQ ID NO:12     A            214B7RP6
TrpC                       SEQ ID NO:13     A            pUCATPH
9. FP1       1885          SEQ ID NO:14     A            214B7TrpC
10. FP1b     1828          SEQ ID NO:15     B            214B7TrpC
11. FP2      2028          SEQ ID NO:16     B            214M1FP1b
12. FP3      2490          SEQ ID NO:17     C            214M1FP2
13. FP4      2949          SEQ ID NO:18     C            214S1FP3
14. FP4B     2745          SEQ ID NO:19     C            214S1FP4
15. FP5      3421          SEQ ID NO:20     C            214S1FP4
16. FP6      3948          SEQ ID NO:21     C            214S1FP5
17. FP7      4411          SEQ ID NO:22     C,D          214S1FP6
18. FP8      5035          SEQ ID NO:23     D            118B14FP7
19. FP9      5457          SEQ ID NO:24                  118BC4FP8
20. RP48     2865          SEQ ID NO:25     D            214S1FP6
21. FP10     5790          SEQ ID NO:26     F            9P2FP9
22. FP11     6327          SEQ ID NO:27     F            9P2FP10
23. FP11b    6211          SEQ ID NO:28     F            9P2FP10
24. FP12     6457          SEQ ID NO:29     F            9P2FP11
25. FP13     6854          SEQ ID NO:30     F            9P2FP12
26. FP14     7400          SEQ ID NO:31     F            9P2FP13
27. FP15     7771          SEQ ID NO:32     F            9P2FP14
28. FP16     8145          SEQ ID NO:33     F            9P2FP15
29. FP17     8492          SEQ ID NO:34     F            9P2FP16
M13F40                     SEQ ID NO:35     G            pUC18
30. RP1      8953          SEQ ID NO:36     G            9P5M13F4
31. RP2      8559          SEQ ID NO:37     G            9P5RP1

a "RP" indicates reverse primer; "FP" indicates forward primer. Primers
designed to genomic DNA sequences are numbered in order. Primers 1-17
have a leading number "214"; 18-20 "118"; 21-29 "9P2" and 30-31 "9P5".
M13RMT (an M13R mutant version; there is a mutation in the polylinker of
pUC18) and M13F-40 were provided by the Cornell DNA Sequencing Facility.
The TrpC primer site is in the pUCATPH TrpC promoter region, 38 bp from
the SalI site, with sequencing direction from SalI to KpnI.

b The position of the first base of each primer corresponds to the
assembled sequence (CPS1 + TES1, total 11.3 kb).

c Each primer sequence is given in the 5' to 3' direction.

d Plasmids used as templates for each sequencing reaction. A=p214B7;
B=p214M1; C=p214S1; D=p118B14; E=p118BC4; F=p9P2; G=p9P5 (=9P2)

e Original sequences that were used for primer design.

Results 

Recovery of tagged DNA from the REMI insertion site and targeted gene
disruption. Genomic DNA of mutant R.C4.2696 was digested with BglII, MscI
(no sites in pUCATPH) or SacI (which cuts the vector once) and purified by
phenol extraction and ethanol precipitation, then dissolved in TE (pH 8.0).
Ligation was performed in a 50 µl reaction mixture containing 1 x T4 DNA
ligase buffer with 10 mM ATP, 60 units T4 DNA ligase (New England Biolabs,
Beverly, MA) and 3 µg of BglII-digested genomic DNA, at 14 °C overnight. Ten
µl of ligation mixture was used to transform 200 µl of competent DH5α cells,
prepared using the calcium chloride treatment (Sambrook et al., 1989, supra),
to ampicillin resistance. Ampicillin resistant clones were analyzed by
digestion of plasmid DNA with several diagnostic restriction enzymes, and
clones containing the REMI vector plus flanking genomic DNA were sequenced
using the vector-specific primers (M13R or TrpC). Three plasmids, p214B7,
p214M1 and p214S1, were recovered and used for sequencing. p214B7 contains
4.2 kb flanking DNA (3.4 left; 0.7 right); p214M1 contains a 0.1 kb left flank
that overlaps with p214B7 and a 1.1 kb right flank that overlaps with p214S1,
which contains 3.2 kb flanking DNA on the left only.

For targeted gene disruption in wild type, p214B7 was amplified and
plasmid DNA purified by equilibrium centrifugation in CsCl-ethidium bromide
gradients (Sambrook et al., 1989, supra). Thirty µg of plasmid DNA (linearized
with BglII for double crossover integration) were used to transform wild type,
and the transformants were purified by isolation of single conidia, assayed
for pathogenicity and characterized by gel blot analysis.

Sequence extension by targeted integration and plasmid rescue. Two
overlapping cosmid clones were isolated by probing a genomic DNA library of
C4 constructed on a cosmid vector, but both extended into the left region only
of p214B7. To extend to the right, a chromosome walking strategy was employed.
Three targeted gene disruption experiments (each followed by plasmid rescue)
were done successively. In the first experiment, a vector was constructed as
follows: p214S1 was digested with NarI and religated to create p214S1N, which
was then digested with HindIII and ligated into the HindIII site of pUCATPH to
create p214SNP for transformation of race O (C5). One transformant (Tx118)
resulting from homologous integration (confirmed by gel blot analysis) was
used for plasmid rescue as described above. Two new plasmids, p118B14 and
p118BC4, were recovered, both of which carry sequence at the 3' end but only
172 and 680 bp more than p214S1, respectively. To continue the walk, p118B14
was digested with SacI and ligated into the SacI site of pUCATPH to create
p118BSP. This vector was linearized with BglII and transformed into wild type,
and one plasmid, p9P2, was recovered (from transformant Tx9), which extends
4.4 kb into the region 3' of p118BC4 and contains the 3' end of CPS1. The
recovered plasmid p9P2 includes the entire pUC18 sequence on p118BSP and
4.6 kb of genomic DNA that contains all of ORF1 (CPS1), including the stop
codon (TAG) and 3.0 kb of genomic region 3' of the stop codon. A third
experiment was done in an attempt to recover a 15 kb XhoI fragment at the 3'
end of the tagged gene. p118BCS was constructed by subcloning a 0.8 kb SspI
fragment into the same site of pUCATPHN. Plasmid rescue using XhoI-digested
genomic DNA of a transformant (Tx12) failed to recover the 15 kb XhoI
fragment, but p12H6 was recovered using HindIII-digested genomic DNA of the
same transformant; the genomic DNA matched that already cloned on p9P2.

Characterization of the REMI mutant. In all culture conditions used,
mutant R.C4.2696 grew just like wild type, with no variations in growth rate,
color or morphological features. It produces normal appressorium-forming
conidia that germinate and form infection structures like wild type when
induced on artificial surfaces, and it shows normal mating ability when
crossed to wild type testers. No pleiotropic phenotypes associated with the
mutation have been detected so far. The mutant differs from wild type in the
ability to cause disease on corn plants.

The lengths of 100 typical lesions from corn leaves inoculated with wild
type race O and a mutant progeny R45 (Tox-, hygBR) carrying the R.C4.2696
mutation were measured 7 days after inoculation and values plotted.
When tested on T-cytoplasm corn, the mutant produces race T type symptoms,
but the disease develops more slowly than with wild type although the mutant
produces wild type levels of T-toxin as detected in a microbial assay,
suggesting that the reduced virulence is not related to a deficiency in the
ability to produce T-toxin. This is clearer on N-cytoplasm corn, where the
mutant produces lesions significantly smaller than those produced by wild
type. When the mutant was crossed to a wild type race O tester, the small
lesion phenotype and the ability to produce T-toxin segregated independently,
indicating that the mutant phenotype is not associated with the reduced
fitness trait tightly linked with the Tox1 locus (Klittich et al.,
Phytopathology, 76:1294 (1986)). The statistical evaluation of lesion size in
the wild type race O genetic background indicates that the mutation causes an
approximately 60% reduction in fungal virulence to corn plants. As shown in
Table 3, 86% of the mutant lesions are 4 mm or less in length; the mutant mean
of 3.5 mm is approximately 60% smaller than the wild type mean of 8.5 mm.

Table 3

                      Lesion size (mm)
           --------------------------------------
           Frequency
Strain     1-4     5-8     9-12     Mean     SD
---------  ------  ------  -------  -------  ------  ----
WT         0       52      48       8.5      1.0     A*
R45        86      14      0        3.5      0.9     B

*Significant difference at P < 0.01.
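
The reduction in lesion size can be recomputed from the summary values in
Table 3. The following is only an illustrative sketch and is not the
statistical procedure used in the original evaluation: the function names are
hypothetical, and the Welch-type t statistic assumes 100 lesions per strain,
an assumption drawn from the frequency columns each summing to 100.

```python
import math


def percent_reduction(wild_type_mean: float, mutant_mean: float) -> float:
    """Reduction of the mutant mean relative to the wild type mean, in percent."""
    return 100.0 * (wild_type_mean - mutant_mean) / wild_type_mean


def welch_t(mean_a: float, sd_a: float, n_a: int,
            mean_b: float, sd_b: float, n_b: int) -> float:
    """Two-sample t statistic computed from summary statistics only."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Values from Table 3; n = 100 per strain is an assumption (see lead-in).
WT_MEAN, WT_SD = 8.5, 1.0
R45_MEAN, R45_SD = 3.5, 0.9
N_LESIONS = 100

reduction = percent_reduction(WT_MEAN, R45_MEAN)
t_statistic = welch_t(WT_MEAN, WT_SD, N_LESIONS, R45_MEAN, R45_SD, N_LESIONS)

print(f"Reduction in mean lesion length: {reduction:.1f}%")
print(f"t statistic (summary data): {t_statistic:.1f}")
```

Under these assumptions the computed reduction is just under 60%, in line
with the rounded figure given in the text.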

The mutant phenotype is caused by a tagged, single site mutation. In
crosses between the mutant and wild type testers, progeny segregated 1:1 for
parental types only: all hygromycin B-resistant progeny produced lesions
similar to the mutant parent, and all hygromycin B-sensitive progeny produced
wild type lesions, indicating that a tagged mutation is responsible for the
reduced pathogenicity of the mutant. Table 4 depicts the progeny segregation
data.
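
The 1:1 segregation can be checked with a chi-square goodness-of-fit test on
the random-spore counts reported in Table 4 (24:22 and 21:22). This is a
minimal sketch rather than the analysis performed in the original work; the
function name and dictionary keys are hypothetical, and 3.841 is the standard
5% critical value for one degree of freedom.

```python
def chi_square_one_to_one(count_a: int, count_b: int) -> float:
    """Chi-square statistic for a 1:1 expectation between two classes."""
    expected = (count_a + count_b) / 2
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected


CRITICAL_VALUE_5PCT_DF1 = 3.841

# (hygromycin B-resistant, hygromycin B-sensitive) random-spore progeny
# counts taken from Table 4.
random_spore_counts = {
    "first cross": (24, 22),
    "second cross": (21, 22),
}

for cross, (resistant, sensitive) in random_spore_counts.items():
    statistic = chi_square_one_to_one(resistant, sensitive)
    consistent = statistic < CRITICAL_VALUE_5PCT_DF1
    print(f"{cross}: chi-square = {statistic:.3f}, "
          f"consistent with 1:1 = {consistent}")
```

Both statistics fall far below the critical value, so neither cross departs
detectably from a 1:1 ratio.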

Table 4

                                    Parental type       Nonparental type
                                    ------------------  ------------------
Cross             Progeny           path      PATH      path      PATH
                                    hygBR     hygBS     hygBR     hygBS
----------------  ----------------  --------  --------  --------  --------
R.C4.2696 x C5    random spores     24        22        0         0
1301-R33* x C5    tetrad 1          4         4         0         0
                  tetrad 2          4         4         0         0
                  tetrad 3          4         4         0         0
                  random spores     21        22        0         0

*13012-R33 (path, hygBR, Tox-, MAT-2) is a progeny from the first
cross, carrying the R.C4.2696 mutation.

Example 2

Cloning, Sequencing and Characterization of DNA Flanking the 
REMI Vector Insertion Site 
A total of 11.3 kb of genomic DNA surrounding the insertion site was
cloned and completely sequenced (SEQ ID NO:59; Figure 2). The sequence was
derived from seven plasmid clones. The first three (p214B7, p214M1 and
p214S1) were recovered from the tagged site in mutant R.C4.2696 and cover
about 60% (6.6 kb) of the entire region. The rest (p118B14, p118BC4, p9P2 and
p12H6) were recovered from transformants generated using the chromosome
walking strategy. DNA to the left of the insertion site (3.4 kb) was cloned on
p214B7; DNA on the right (7.9 kb) was cloned on different overlapping
plasmids. p9P2 carries the largest amount (4.6 kb), including the genomic DNA
on p12H6.

Analysis of the combined sequences revealed two open reading frames
(ORFs). ORF1 (5.4 kb) starts 576 bp upstream of the REMI vector insertion site
and ends with an in-frame stop codon (TAG) 3029 bp from the end of the
sequenced region in the right flank. No "TATA" box-like element is found in
the expected position, but five putative "CAAT" boxes are located upstream of
the start codon (ATG); three of them are in the range found in most
filamentous fungal promoters (60-200 bp) (Gurr et al., 1987, infra). The
sequence around the ATG of ORF1 (CACCATGCT) (SEQ ID NO:38) is similar to the
fungal consensus (CACCATGGC) (SEQ ID NO:39). Although several ATGs are found
upstream, they are less likely to be used as a start codon because the
surrounding sequences lack similarity to the consensus. Three putative introns
are identified by their conserved 5' and 3' border sequences and potential
branch sites (Table 5). Splicing these introns eliminates stop codons that
would otherwise interrupt the 5.4 kb open reading frame. The three introns are
similar in size (45-53 bp), which is in the range of intron sizes determined
for most fungal genes. A putative polyadenylation signal (ATAA) is found
223 bp downstream of the translation termination site.

The G+C content of ORF1 is 51.5%, which is similar to most
Cochliobolus genes (Turgeon et al., Mol. Gen. Genet., 238:270 (1993); VanWert
et al., Curr. Genet., 22:29 (1992); Yang et al., Plant Cell, 8:2139 (1996);
Rose et al., 1996, supra). Interestingly, ORF1 is flanked by two regions of
G+C rich DNA. The first (1.4 kb, 60.3% G+C) is found between ORF1 and ORF2;
the second (1.2 kb, 60.3% G+C) is found 1.8 kb downstream of the stop codon of
ORF1. Database searches using the translated protein sequence of ORF1
revealed high similarity to SafB, one of the multifunctional enzymes
catalyzing the biosynthesis of the cyclic peptide antibiotic saframycin Mx1
produced by the bacterium Myxococcus xanthus (Pospiech et al., Microbiology,
142:741 (1996)).
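
G+C content is the percentage of G and C bases in a sequence, and G+C rich
stretches such as those flanking ORF1 can be located by scanning a sequence in
windows. The sketch below illustrates the calculation on a short made-up
sequence; the function names, window size and example sequence are
assumptions for illustration only and are not taken from the sequences
disclosed here.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    total = gc + sequence.count("A") + sequence.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


def windowed_gc(sequence: str, window: int, step: int) -> list:
    """G+C content of successive windows as (start index, percent) pairs."""
    return [
        (start, gc_content(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Hypothetical toy sequence: an AT-only half followed by a GC-only half.
toy_sequence = "ATATATATGCGCGCGC"

print(gc_content(toy_sequence))
print(windowed_gc(toy_sequence, window=8, step=8))
```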

The entire nucleotide sequence of ORF1 (CPS1) is designated SEQ ID NO:2
(6,550 base pairs from the 11.3 kb sequenced region, Figure 2). The deduced
amino acid sequence of CPS1 protein is designated SEQ ID NO:3. A
modification of the ChCPS1 sequence, including changes in three base pairs
("ATG" added between positions 5349 and 5350 of the GenBank entry
(GenBank Accession number AF332878)) and an addition of 31 amino acids (the
first thirty amino acids ("MMGNYAFNPDNQQSYDGQFGSPGEASRRST")
were added at the N-terminus based on the selection of a new start codon and
an additional methionine ("M" at position 1489 was missing in the GenBank
entry)) is designated SEQ ID NO:50 (6553 base pairs). The deduced amino acid
sequence of the modified ChCPS1 protein is designated SEQ ID NO:185 (1774
amino acids; revised version of the original CPS1 protein (GenBank Accession
number AAG53991)). The open reading frame is 5,474 base pairs (736-6209), a
93 base pair increase compared to the deposited sequence, which was 5,381 bp.
A new start codon (position 736; the original one was at position 826) was
proposed based on the amino acid alignment of several CPS1 orthologs from
different fungi that revealed conserved residues in this region. The stop
codon (6,209) is the same as in the original GenBank sequence.

Table 5. Characteristics of putative introns in CPS1 and TES1



Gene   Intron   Size   Location    5' Border    3' Border   Branch Site
                (bp)
-----  -------  -----  ----------  -----------  ----------  ------------
CPS1   I        45     3060-3105   GTAAGT       TAG         GTCTAAC
       II       51     4532-4582   GTAAGT       GAG         TGCTAAC
       III      53     5187-5239   GTACGT       CAG         TACTAAC
TES1   I        49     528-566     GTAAGT       TAG         CCTTAAG
Cons                               GTA(A/C)GT   (T/C)AG     YNCTAAC*
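
The lengths quoted for the revised and deposited CPS1 open reading frames are
mutually consistent under one simple reading, sketched below: the span is
counted inclusively from the first base of the start codon to the last base of
the stop codon, and the three CPS1 intron sizes listed in Table 5 are removed
before translation. This is an illustrative check under those assumptions,
with hypothetical function names, and not a description of how the sequences
were annotated.

```python
def span_length(first_base: int, last_base: int) -> int:
    """Inclusive length of a feature given 1-based coordinates."""
    return last_base - first_base + 1


def protein_length(orf_bp: int, intron_sizes_bp: list) -> int:
    """Amino acids encoded by an ORF after removing introns and the stop codon."""
    coding_bp = orf_bp - sum(intron_sizes_bp)
    if coding_bp % 3 != 0:
        raise ValueError("spliced coding length is not a multiple of three")
    return coding_bp // 3 - 1  # the final codon is the stop codon


CPS1_INTRON_SIZES_BP = [45, 51, 53]          # sizes as listed in Table 5

revised_orf_bp = span_length(736, 6209)      # revised start to stop codon
deposited_orf_bp = 5381                      # length stated for the deposit

print(revised_orf_bp)
print(revised_orf_bp - deposited_orf_bp)
print(protein_length(revised_orf_bp, CPS1_INTRON_SIZES_BP))
print(protein_length(deposited_orf_bp, CPS1_INTRON_SIZES_BP))
```

With these inputs the sketch reproduces the 5,474 bp open reading frame, the
93 bp increase, and protein lengths of 1774 and 1743 amino acids for the
revised and deposited sequences, respectively.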



ORF2 starts about 1.6 kb upstream of the start codon of CPS1 and is
transcribed in the opposite direction (Figure 2). No "TATA" box-like element
or CAAT box is found; instead, an AT-rich sequence "AAAACTAT" is located
11 bp upstream of the start codon ATG and a CT motif is found in the -30
region, which is characteristic of a number of fungal genes that lack a CAAT
box in their promoter region (Gurr et al., In: Gene Structure in Eukaryotic
Microbes, Vol. 22, published by the Society for General Microbiology, Oxford,
England: IRL Press, Kinghorn, ed., pp. 93-140 (1987)). The sequence around the
ATG matches the fungal gene consensus perfectly. A putative intron (50 bp) is
found in the middle of ORF2 with conserved 5' and 3' border sequences and a
potential branch site (Table 5). A putative polyadenylation signal (AAATA) is
found 189 bp downstream of the translation stop codon TGA. The G+C content
of ORF2 is 55.5%, which is slightly higher than the normal range because the
5' end of ORF2 is located in the region of G+C rich DNA upstream of ORF1.
A database search revealed that ORF2 encodes a protein with high similarity to
Homo sapiens thioesterase II (hTE, Liu et al., J. Biol. Chem., 272:13779
(1997)) and E. coli thioesterase II encoded by the tesB gene (Naggert et al.,
J. Biol. Chem., 266:11044 (1991)). The nucleotide sequence of ORF2 (TES1) is
designated SEQ ID NO:57. The deduced amino acid sequence of the TES1
protein is designated SEQ ID NO:58.
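
The comparison of intron borders and branch sites with the consensus row of
Table 5 can be sketched as a pattern match in which the consensus is written
with IUPAC ambiguity codes (M = A or C, Y = C or T, N = any base). The code
below is illustrative only; the function and variable names are hypothetical,
and the sequences are transcribed from Table 5.

```python
import re

# IUPAC codes needed to express the Table 5 consensus as a regular expression.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "M": "[AC]", "Y": "[CT]", "N": "[ACGT]"}

# Consensus for the 5' border, 3' border and branch site, respectively.
CONSENSUS = ("GTAMGT", "YAG", "YNCTAAC")

# (5' border, 3' border, branch site) for each putative intron in Table 5.
INTRONS = {
    "CPS1 I": ("GTAAGT", "TAG", "GTCTAAC"),
    "CPS1 II": ("GTAAGT", "GAG", "TGCTAAC"),
    "CPS1 III": ("GTACGT", "CAG", "TACTAAC"),
    "TES1 I": ("GTAAGT", "TAG", "CCTTAAG"),
}


def fits_consensus(sequence: str, consensus: str) -> bool:
    """True if every position of the sequence is allowed by the consensus."""
    pattern = "".join(IUPAC[code] for code in consensus)
    return re.fullmatch(pattern, sequence) is not None


report = {
    name: tuple(fits_consensus(seq, cons) for seq, cons in zip(seqs, CONSENSUS))
    for name, seqs in INTRONS.items()
}

for name, (five_prime, three_prime, branch) in report.items():
    print(f"{name}: 5' border={five_prime}, 3' border={three_prime}, "
          f"branch site={branch}")
```

A strict match of this kind flags every single-base departure from the
consensus, so elements described as conserved in the text can still differ
from the consensus at individual positions.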

Modular structure of CPS1. The predicted CPS1 protein (1743 amino acids,
Mr 193,235) contains two structurally similar modules, both of which are
similar to SafB1, the first module of saframycin synthetase B (overall 25%
identity; 50% similarity), and have apparent amino-acid-activating and
thiolation domains but lack methyltransferase activity, thus appearing to be
typical type I modules (Figure 3). The number of amino acids in each module is
different: the first module (CPS1A) consists of 574 amino acids (from the
first residue of core 1 to the last residue of core 6), which is larger than
most type I modules; the second module (CPS1B) has 530 amino acids, which is
average. The distance between the two modules is 193 amino acids, much shorter
than in most peptide synthetases (500-600 amino acids), but this distance is
not highly conserved; for example, the opposite variation is found in HC-toxin
synthetase and cyclosporine synthetase, both of which have about 1,000 amino
acids between the first and second amino-acid-activating modules (see
Table 6F).

Tables 6A-F show a comparative alignment of core amino acid sequences
in CPS1A and CPS1B with those of other peptide synthetases. In each of Tables
6A-F, the first column shows the names of peptide synthetases; the second
indicates the position of the first residue aligned in the original amino acid
sequence of each protein; the last column on the right indicates the number of
amino acids between two cores (6A-E, in parentheses) or the distance between
two adjacent amino-acid-activating modules (Table 6F, in parentheses). The
extra column in 6F shows the total number (underlined) of residues in each
amino-acid-activating module in which the aligned core sequence is located.
The consensus of each core sequence is on the top, which includes identical or
similar residues found in all peptide synthetases or with only a few
exceptions (active site also indicated by asterisks). SafB1: the first module
in saframycin Mx1 synthetase B of Myxococcus xanthus (GenBank Accession No.
U24657); GrsA: gramicidin S synthetase A of Bacillus brevis (SWISS PROT
Accession No. P14687); HTS1A and HTS1B: the first two modules in HC-toxin
synthetase of Cochliobolus carbonum (Q01886); EsynA and EsynB: two modules in
enniatin synthetase of Fusarium scirpi (EMBL Accession No. Z18755); ACVA
and ACVB: the first two modules in ACV synthetase of Aspergillus nidulans
(SWISS PROT P19787); CsynA and CsynB: the first two modules in
cyclosporine synthetase of Tolypocladium niveum (EMBL Accession No.
Z28383).

Amino acid alignment of the two modules of CPS1 to SafB1 indicated
that these modules are highly similar to each other in both overall amino acid
composition and conserved motif sequences as defined by Stachelhaus and
Marahiel (Stachelhaus et al., 1995, supra; Marahiel, 1997, supra). When
aligned to other bacterial or fungal peptide synthetases, CPS1 showed only
local similarity to cyclosporine synthetase (Weber et al., Current Genetics,
26(2):120 (1994)) and tyrocidine synthetase A (Mootz et al., J. Bacteriol.,
179(21):6843 (1997)), but when the amino acids in motif regions were aligned,
an overall conservation was observed. Both CPS1A and CPS1B have all five core
sequences in the amino-acid-activating domain (Tables 6A-E). Cores 3 and 4 are
well conserved except for the replacement of an aspartic acid residue of core
4 by a leucine in CPS1A. Cores 1, 2 and 5 show weak conservation, but similar
variations are also seen in SafB1. A thiolation domain is found in both
modules, which contains a highly conserved motif (core 6, Table 6F). The
serine residue in this motif has been shown to be the active site for
4'-phosphopantetheine attachment (Schlumbohm et al., J. Biol. Chem.,
266:23135 (1991); Stein et al., FEBS Lett., 340:39 (1994)).

The distances between the six core sequences in the two modules are also
largely conserved. Two exceptions are found in the first module, which has 312
amino acids between cores 2 and 3, larger than normal (150-200), and 61
between cores 5 and 6, only half of that of most peptide synthetases. SafB1
also shows distance variations at these two interval regions (Tables 6B and
E). In addition to amino-acid-activating and thiolation domains, CPS1 also has
an integrated thioesterase domain (TE) in the carboxy-terminal end of CPS1B
(Figure 12). A signature sequence GXSXG (SEQ ID NO:147), which is highly
conserved in animal fatty acid thioesterase type II enzymes and several
peptide synthetases, is found in this domain (Table 7).

Sequence homology analysis of TES1 protein. The predicted TES1
protein consists of 367 amino acids (Mr 41,013). Amino acid alignment of TES1
to hTE, TESB and a Mycobacterium tuberculosis TESB homolog (Philipp et al.,
Proc. Natl. Acad. Sci. USA, 93:3132 (1996)) showed that these proteins have an
overall 40% identity and 60% similarity. A highly conserved VHS motif
(putative active site) is found in the C-terminal region of TES1 at a
conserved position (Figure 13). None of these thioesterases has sequence
similarity with the previously identified animal type I or type II
thioesterases known to be involved in the chain termination of fatty acid
synthesis (Naggert et al., J. Biol. Chem., 266:11044 (1991)). Interestingly,
TES1 is more similar to hTE than to the two bacterial proteins, suggesting
that TES1 and hTE belong to a new family of eukaryotic thioesterases.

Targeted disruption of CPS1. Disruption of either CPS1A or CPS1B
reproduced the original mutant phenotype. Ten transformants from each of four
individual disruption experiments using different constructs, including the
plasmid recovered from the REMI insertion site in the mutant (p214B7) and
three vectors for chromosome walking (p214SNP, p118BSP and p118BCS), were
purified and assayed on N-cytoplasm corn. All transformants showed the same
small lesion phenotype as that of the original REMI mutant. Southern blot
analysis confirmed that all transformants showing the mutant phenotype
resulted from homologous integration of the transforming vector that disrupted
the wild type CPS1. No transformants showing the wild type phenotype were
obtained, presumably because the large genomic DNA fragments (over 800 bp in
all disruption experiments) on the transforming vector gave a high efficiency
of homologous recombination and hence a low chance of recovering
transformants with ectopic integration.

Example 3

Targeted disruption of CPS1 homolog in C. victoriae
Methods and Materials

Strains, growth conditions and transformation. Strains of Cochliobolus
species and relatives used for genomic DNA hybridization are listed in Table
8. The strain HvW, a victorin-producing isolate of C. victoriae, was recovered
from storage and grown on CMX medium (Turgeon et al., Mol. Gen. Genet.,
201:450 (1985)) for conidiation or on oat meal agar medium (Churchill et al.,
Fungal Genet. Newsl., 42A:41 (1995)) for victorin detection at 24°C under warm
white lights (Sylvania Inc., Danvers, MA). Transformation was done using the
C. heterostrophus procedure (Turgeon et al., Mol. Gen. Genet., 238:270
(1993)).



Table 8. Detection of CPS1 homologs in Cochliobolus spp. and relatives

                                                      Hybridization
                                            -------------------------------
Strain(a)               Host(b)             EcoRI      HindIII    BglII
                                            digest(c)  digest(d)  digest(e)
----------------------  ------------------  ---------  ---------  ---------
C. heterostrophus       Corn
  race T (C4)           (Turf-13)           +          5.2, 3.2   4.2
  race O (C5)                               +          5.2, 3.2   4.2
C. carbonum             Corn(1)
  race 1 (26R13)        (hm1hm1)            +          6.6        5.0
  race 2 (Yug Y)                            N          6.6        5.0
  race 3 (BZ1703)*                          N          6.6        5.0
C. victoriae (HvW)      Oats (Vb)           +          N          5.0
C. sativus (A20)        Grasses(2)          +          3.0        N
C. specifer (D5-7)      Grasses(2)          +          N          N
C. homomorphus          Unknown             N          5.8        N
  (ATCC 13409)
C. dactyloctenii        Unknown             N          5.9        N
  (7938-9)
S. turcica (NK2)        Sorghum and         +          N          N
                        maize(3)
S. rostrata (32197)     Weeds and           +          2.8        N
                        bamboo(4)
B. sacchari             Sugarcane(5)
  (764-1)                                   +          5.4, 2.5   N
  (1249-10)                                 N          5.4, 2.5   N

a C = Cochliobolus. S = Setosphaeria. B = Bipolaris. The names of isolates
  (or lab strains) of each species are given in parentheses and those known
  to produce host-specific toxins are underlined. *Provided by Tsukiboshi
  Takao (Japan); the isolate could be either BZ1209 or BZ1703.

b Genotype susceptible to the host-specific toxin-producing isolate is given
  in parentheses. References for hosts of those species not mentioned are
  as follows: 1: Welz et al., Phytopathology, 83:593 (1993); Leonard et al.,
  Phytopathology, 80:1154 (1990) (for races 2 and 3 only). 2: Domsch et
  al., "Compendium of Soil Fungi," Vol. 1, New York, New York: Academic
  Press, pp. 216-222 (1980). 3: David et al., "Fungi on Plants and Plant
  Products," St. Paul, Minnesota: APS Press, p. 635 (1989); Thakur et al.,
  Plant Dis., 73:151 (1989). 4: Rao et al., Indian Bot. Rep., 6:38 (1987);
  Bhat et al., Curr. Sci. (Bangalore), 58:1148 (1989). 5: Yoder, Ann. Rev.
  Phytopathol., 18:103 (1980).

c Genomic DNAs (from a previously prepared gel blot filter, Rose et al.,
  1996, supra) were probed with the 3.4 kb CPS1 fragment cloned on
  p214B7. "+" indicates a strong hybridization signal. All species
  hybridized to a large fragment (about 23 kb).

d Genomic DNAs selected from a collection were probed with the CPS1
  3.2 kb fragment cloned on p214S1. The size of fragments that hybridized
  to the probe is given in kb. The intensities of hybridization signals were
  similar to each other. N = not done.

e Genomic DNAs were probed with the same CPS1 fragment as in c.

DNA manipulations and targeted disruption of the CPS1 homolog of C.
victoriae. Genomic DNAs for probing were prepared according to Yoder (In:
Genetics of Plant Pathogenic Fungi, Vol. 6, San Diego, California: Academic
Press, Sidhu, ed., pp. 93-112 (1988)), or selected from a lab DNA collection
(stored at 4°C). A gel blot filter bearing known genomic DNAs was also
probed. Plasmid DNA preparation, restriction enzyme digestions, gel
electrophoresis and gel blot analysis were done using standard protocols
(Sambrook et al., 1989, supra). For probing, CPS1 fragments of C.
heterostrophus cloned on p214B7 (3.4 kb left flank) and p214S1 (3.2 kb right
flank) were prepared by restriction enzyme digestion of the plasmid DNAs
followed by purification using the QIAquick Gel Extraction Kit (QIAGEN Inc.,
Chatsworth, CA). The plasmid p118B14, which carries the 2.3 kb BglII fragment
of CPS1 interrupted by the hygB cassette, was linearized with BglII and
introduced into the HvW genome. Transformants were purified by isolation of
single conidia, and genomic DNAs were digested with BglII and probed with the
CPS1 3.2 kb fragment.

Bioassays. Pathogenicity was determined by an oat plant assay. Fungal
strains were grown in individual oat meal agar medium plates (60 x 15 mm)
containing hygromycin B (60 µg/ml) for 10 days at 24 °C under lights. Conidia
were scraped from the plates and suspended in 6 ml sterilized distilled water.
One ml of conidial suspension of each strain was mixed with 60 seeds of
susceptible or resistant oats. Inoculated seeds were planted in 4" x 6" pots
and seedlings were allowed to grow for two weeks. Seed germination rate and
symptom development were recorded at different stages (4, 6, 8 and 24 days
after inoculation). Detection of victorin production using HPLC analysis was
done by Alice Churchill in Dr. Vladimir Macko's lab at Boyce Thompson
Institute for Plant Research.
Results 

Detection of CPS1 homologs. Genomic DNAs of 12 isolates (or lab
strains) of 9 fungal species hybridized to CPS1 (Table 8). All 6 Cochliobolus
species, including 4 known plant pathogens (C. carbonum, C. victoriae, C.
sativus and C. specifer) and 2 species with unknown hosts (C. homomorphus and
C. dactyloctenii), gave hybridization signals of the same intensity as that of
C. heterostrophus CPS1 fragments. Two phytopathogenic Setosphaeria species and
Bipolaris sacchari, a sugarcane pathogen, gave a similar hybridization
intensity.

CPS1 homologs appear to be polymorphic among different species, i.e.,
all species gave one or two unique bands when BglII- or HindIII-digested
genomic DNAs were probed (except for C. victoriae, which showed the same
hybridization pattern as C. carbonum) (Table 8). Interestingly, EcoRI-digested
genomic DNAs of the same species did not show polymorphisms; all species
hybridized to a large fragment (about 23 kb, Table 8), indicating the absence
of an EcoRI site in all CPS1 homologs, as in the C. heterostrophus gene. In C.
heterostrophus, a >12 kb genomic region that includes CPS1 (5.4 kb), TES1
(1.1 kb) and sequence downstream of the 3' end of CPS1 has no EcoRI sites. In
contrast to species-dependent polymorphisms, CPS1 homologs appear to be highly
conserved among different isolates of the same species. Both C.
heterostrophus race T and race O hybridized to the same 4.2 kb BglII fragment
(or 5.2 and 3.2 kb HindIII fragments); all three C. carbonum races hybridized
to the same 5.0 kb BglII fragment (or 6.6 kb HindIII fragment) (Table 8); and
B. sacchari isolates 764-1 and 1249-10 hybridized to the same HindIII
fragments (5.4 and 2.5 kb) (Table 8).

Twenty transformants were obtained from transformation of the victorin-
producing isolate HvW with BglII-linearized plasmid p118B14. Six
transformants were purified and assayed for both victorin production and
pathogenicity to susceptible oat plants. All transformants produced wild type
levels of victorin as determined by HPLC analysis, but four of them (Tx7, Tx2,
Tx5 and Tx8) showed dramatically reduced virulence in the plant assay. The
seed germination rate on the eighth day after inoculation was only 13-25% for
wild type and two transformants (Tx9 and Tx4), but 45-63% for the other four
transformants. On day 24 after inoculation, all plants that emerged from the
seeds inoculated with wild type, Tx9 or Tx4 had been killed, but many (29-63%)
of those from the seeds inoculated with Tx2, Tx7, Tx5 or Tx8 still survived
(Table 9). Southern blot analysis confirmed that transformants showing the
reduced virulence phenotype resulted from homologous integration of the
transforming vector that disrupted the wild type CPS1 homolog in the C.
victoriae genome; transformants showing the wild type phenotype resulted from
ectopic integration events that left the native gene intact. All transformants
remained nonpathogenic to resistant oats, indicating that disruption of the
CPS1 homolog does not affect host specificity of the fungus.

Table 9. Disease development of oat plants inoculated with C. victoriae
transformants (Tx).

             No. germinated (b)      Germination     No. survivors (d)
Strain (a)   Day 4  Day 6  Day 8     rate (%) (c)    Day 24    %

Control-1     28     41     45           75            75     100
Control-2     40     50     50           83            50     100
Control-3      1      7     12           20             0       0
Tx2            8     26     27           45            16      59
Tx4            5     15     15           25             0       0
Tx5            2     24     28           47             8      29
Tx7           14     36     38           63            24      63
Tx8            7     29     29           47            13      47
Tx9            0      3      8           13             0       0

a Control-1 = uninoculated susceptible oat seeds. Control-2 and Control-3
= resistant and susceptible oat seeds inoculated with wild type C.
victoriae (isolate HvW), respectively. Six transformants were tested on
both resistant and susceptible seeds, but only data for the latter are shown
(all transformants gave the same results as Control-2 when tested on
resistant seeds). Repeat experiments gave similar results (data not
shown).

b Sixty oat seeds were used for each strain. Emerged oat plants were
counted 4, 6 and 8 days after inoculation.

c Calculation based on the data collected on day 8.

d Recorded on day 24 after inoculation. The percentage of survivors is
based on the number of plants recorded on days 8 and 24.
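
As a non-limiting illustration of footnotes b through d, the two percentages
in Table 9 can be recomputed from the raw counts as sketched below. The
dictionary holds three rows copied from the table; the function names and the
rounding to whole percent are assumptions made for this sketch.

```python
SEEDS_PER_STRAIN = 60  # footnote b: sixty seeds per strain

# strain -> (plants emerged by day 8, survivors on day 24), from Table 9
table9 = {"Tx2": (27, 16), "Tx7": (38, 24), "Tx9": (8, 0)}


def percent(part, whole):
    """Whole-number percentage; returns 0 when the denominator is 0."""
    return round(100 * part / whole) if whole else 0


def summarize(day8, day24, seeds=SEEDS_PER_STRAIN):
    """Return (germination rate %, survivor %) as defined in footnotes c and d."""
    germination = percent(day8, seeds)   # day-8 count over seeds sown
    survival = percent(day24, day8)      # day-24 survivors over day-8 count
    return germination, survival


for strain, (day8, day24) in table9.items():
    print(strain, summarize(day8, day24))
```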

Discussion

CPS1 encodes an enzyme with an adenylation domain. A gene
designated CPS1 was cloned from the corn pathogen C. heterostrophus using the
REMI mutagenesis procedure. Structural and functional analyses strongly
suggest that CPS1 encodes an enzyme with one or more adenylation domains,
e.g., a CoA ligase. CPS1 contains two repeated functional units with a modular
organization, and has a thioesterase motif (GXSXG; SEQ ID NO: 147). This
motif has been demonstrated to be an active site for catalyzing release of
medium-chain-length (C8-12) fatty acids in fatty acid synthases and potentially
for termination of peptide chains or for repeated acyl transfer reactions,
because the same motif is also characteristic of acyl transferases or acyl
transfer domains (AT) of fatty acid synthases (FAS) and polyketide synthases
(PKS) (Kratzschmar et al., J. Bacteriol., 171, 5422 (1989)).
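
For illustration only, a GXSXG motif of the kind referred to above can be
located in a protein sequence with a simple pattern search, as sketched below.
The example peptide and the function name are hypothetical; X is taken to mean
any residue, and overlapping matches are reported.

```python
import re

# G-X-S-X-G with X = any residue; the lookahead allows overlapping matches.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_gxsxg(protein):
    """Return (0-based position, matched pentapeptide) for each GXSXG hit."""
    return [(m.start(), m.group(1)) for m in GXSXG.finditer(protein.upper())]


# Hypothetical peptide, not a CPS1 sequence.
print(find_gxsxg("MKGHSAGLL"))
```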

Although similar TE domains are found in certain fungal PKSs, i.e.,
Aspergillus nidulans pksL1 gene (Feng and Leonard, J. Bacteriol., 177, 6246
(1995)) and pksST gene (Yu and Leonard, J. Bacteriol., 177, 4792 (1995)),
CPS1 is unlikely to be a polyketide synthase because: 1) it does not show any
significant similarity to known PKSs, and 2) it lacks unique functional domains
found in these proteins such as the ketoacyl synthase domain (KS) and the acyl
transferase domains (AT) found in the N-terminal region of all fungal PKSs
(Yang et al., 1996, supra). This does not exclude the possible common
evolutionary origin of CPS1 and PKSs (Stachelhaus and Marahiel, 1995, supra).

CPS1 could be responsible for biosynthesis of an unidentified peptide
phytotoxin. It is well known that several Cochliobolus species and related
filamentous fungi produce peptide toxins. These include C. carbonum and C.
victoriae, two species most closely related to C. heterostrophus. The former
produces HC-toxin as mentioned above; the latter produces victorin, a
chlorinated cyclized peptide. Alternaria alternata, a plant pathogenic species
from a genus closely related to Cochliobolus, is also known to produce several
peptide toxins such as AM-toxin, a cyclic tetradepsipeptide produced by A.
alternata apple pathotype, and tentoxin, a cyclic tetrapeptide produced by A.
alternata pv. tenuis (Nishimura and Kohmoto, 1983). These findings have led
to the postulation that, in addition to T-toxin, C. heterostrophus might also
produce a similar secondary metabolite, such as a hypothetical "race O" toxin
(Yoder, 1981).

Interestingly, a Tox+, cps1- mutant showed reduced virulence on
T-cytoplasm corn although it produced the same amount of T-toxin as wild type
race T. This is unusual because the interaction between T-toxin and the
T-corn-unique URF13 protein is highly specific; the same outcomes should be
expected if two strains that produce the same amount of T-toxin attack the
same host, T-corn. The most likely explanation for this result is that fungal
growth in planta was inhibited by the host plant, and that the poor growth
resulted in reduced T-toxin production, even though production is normal when
the fungus is grown in culture. Reduced virulence on T-cytoplasm corn would
then be due to reduced T-toxin production, as seen in leaky Tox mutants. This
inhibition of growth could be due to the failure of the fungus to suppress the
host defense mechanism, a suppression which is mediated by the CPS1-controlled
peptide toxin. A cps1- mutant that fails to produce this "suppressor" would
not be able to colonize plant tissues as vigorously as wild type does,
resulting in the reduced ability to cause disease as indicated by the smaller
lesion phenotype. If this turns out to be the case, CPS1 should be considered
a general virulence factor, as proposed for enniatin.
It is possible that cps1- mutants are still able to produce a certain
amount of CPS1 toxin. One possibility is that the gene has not been completely
inactivated by insertional mutagenesis or targeted disruption. The original
REMI insertion occurred at core sequence 1 of CPS1A, a region that might not be
critical (the function of core 1 is unknown). The second targeted site is
located between cores 1 and 2 of CPS1B and the third is located between cores
2 and 3 of the same module. None of the three insertions disrupts critical
motifs. On the other hand, CPS1 contains a number of in-frame start codons,
and some of them are located immediately downstream of these insertion sites.
It is possible that each of these disruptions actually resulted in two
subtranscripts: one transcribed normally from the start codon of CPS1 and
stopping at the insertion site, and a second transcribed from near one of
these in-frame ATGs downstream of the insertion site and stopping at the end
of CPS1. Both transcripts could give a truncated protein that still has
enzymatic activity. But these separate enzymes might have lower affinities for
their substrates than the holoenzyme. The reduced production of CPS1 toxin
might be due to the CPS1 holoenzyme having been split into two fractions by
the vector insertion and the resulting truncated proteins being much less
active than the original polypeptide. This hypothesis can be tested by
constructing a C. heterostrophus strain in which the entire CPS1 encoding
sequence has been deleted.
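
The notion of in-frame start codons lying downstream of an insertion site can
be illustrated with the following sketch. It is offered as an example only:
the toy coding sequence, the insertion coordinate and the function name are
hypothetical, and the sketch does not model transcription or translation
initiation context.

```python
def inframe_atgs_downstream(cds, insertion_pos):
    """Return 0-based positions of ATG codons that are in frame with the
    first codon of `cds` and start at or after `insertion_pos`."""
    cds = cds.upper()
    # First codon boundary at or after the insertion point.
    first = insertion_pos + (-insertion_pos % 3)
    return [i for i in range(first, len(cds) - 2, 3) if cds[i:i + 3] == "ATG"]


# Hypothetical toy coding sequence with ATG codons at positions 0, 9 and 18.
toy_cds = "ATGAAACCCATGGGGTTTATGTAA"
print(inframe_atgs_downstream(toy_cds, 7))
```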

The second possibility is the existence of multiple copies of CPS1 in the
genome. Previous studies have demonstrated that the gene encoding HC-toxin
synthetase (HTS1) is duplicated in the genome and that the two copies (HTS1-1
and HTS1-2) are 270 kb apart in most Tox2+ isolates of C. carbonum (Ahn and
Walton, 1996, supra). Disruption of either copy reduced HTS1 activity but did
not affect HC-toxin production; when both copies were disrupted, HC-toxin
production was abolished (Panaccione et al., 1992, supra). But in contrast to
the case of HTS1, gel blot analysis does not indicate the presence of a second
copy of CPS1, and disruption of CPS1 does affect the production of the
putative toxin. It is unlikely that two genes with similar organization are in
the genome. An alternative postulation is that there may be a second gene
which encodes a protein with the same enzyme activity as CPS1 but does not
have significant sequence homology to CPS1. This hypothesis is hard to test
unless this gene is clustered with CPS1 and can be recovered by chromosome
walking.

In conclusion, pathogenesis by C. heterostrophus to corn involves at least
two secondary metabolites: the T-toxin, a host-specific factor which determines
high virulence on a particular host, T-corn, and the hypothetical CPS1 toxin, a
general factor (either virulence or pathogenicity factor) which contributes to
basic mechanisms underlying the disease establishment by the fungus in
common host plants.

Example 4

CPS1 Orthologs 

As described above, Cochliobolus heterostrophus gene CPS1 encodes a
putative peptide synthetase that appears to be a general factor for fungal
virulence to its hosts. CPS1 has been found to be highly conserved among at
least 9 fungal species belonging to 3 genera, including the genus Cochliobolus
and closely related genera Bipolaris and Setosphaeria; it has been demonstrated
to be required for pathogenesis by three different plant pathogens, i.e., C.
heterostrophus race O and race T to corn and C. victoriae to oats (Lu, 1998,
Ph.D. thesis, Cornell University).

To further explore the role of CPS1 in fungal pathogenesis and its
conservation in other fungi, genomic DNAs of additional species of
Cochliobolus and other closely or distantly related genera were probed with
ChCPS1 by DNA-DNA hybridization (Lu, S.-W., B.G. Turgeon and O.C. Yoder.
1999. Fungal Genetics Conference, March 1999, Pacific Grove, California).

Genomic DNAs of 40 field isolates (or lab strains) representing 34 fungal
species belonging to 16 genera hybridized when probed with ChCPS1 (Figure 4).
All 16 Cochliobolus species, including the known plant pathogens C. carbonum,
C. victoriae, C. miyabeanus, C. sativus and C. spicifer, and five genera closely
related to Cochliobolus, i.e., Pyrenophora, Setosphaeria, Bipolaris, Stemphylium
and Alternaria, showed hybridization intensities comparable to that of C.
heterostrophus itself (Figure 4A). DNAs of species from nine distantly related
genera, including several of economic importance (e.g., Magnaporthe grisea,
Fusarium graminearum, Gaeumannomyces graminis) or of medical importance
(e.g., Candida albicans), hybridized weakly to CPS1 (Figures 4B and 4C),
whereas no signal was detected in DNA of the basidiomycete Ustilago maydis.

Homologs of CPS1 were further identified by polymerase chain reaction
(PCR) using degenerate primers designed to conserved regions of C.
heterostrophus CPS1 (ChCPS1). Four CPS1 homologs were cloned and
characterized. Three of them were cloned from phytopathogenic fungi, including
the wheat head scab fungus Fusarium graminearum (FgCPS1, 6003 bp, SEQ ID
NO:40), the potato early blight fungus Alternaria solani (AsCPS1, 2369 bp,
SEQ ID NO:42) and the barley net blotch fungus Pyrenophora teres (PtCPS1,
2320 bp, SEQ ID NO:44). The fourth was cloned from the human pathogenic
fungus Coccidioides immitis (CiCPS1, 2435 bp, SEQ ID NO:46). The complete
FgCPS1 gene was cloned using both PCR amplification and plasmid rescue
procedures preceded by targeted gene disruption of this gene in the genome. The
remaining three CPS1 homologs were partially cloned by direct PCR
amplification.
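
By way of example only, the degeneracy of a primer written in IUPAC nucleotide
codes, and the set of concrete oligonucleotides it stands for, can be
enumerated as sketched below. The primer strings used here are hypothetical
and are not the primers used to clone the CPS1 homologs; the function names
are likewise chosen for this sketch.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer: N (4 choices) x Y (2 choices) = 8-fold degenerate.
print(degeneracy("GGNTAY"), expand("AR"))
```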

The FgCPS1 open reading frame (5125 bp) has 50% nucleotide identity
to ChCPS1 in about 4.4 kbp of overlap. No "TATA" box-like element was
found in the 5' untranslated region, but other promoter sequences including two
putative "CAAT" boxes and a "CT" motif were located upstream of the start
codon (ATG). There is only one putative intron, found 1508 bp upstream of the
stop codon (TGA), in contrast to three in ChCPS1.

A putative polyadenylation signal "AATAA" is located 62 bp
downstream of the stop codon. The predicted FgCPS1 protein (1692 amino
acids, Mr 187983 Da, SEQ ID NO:41) has 68% identity, 73% similarity to
ChCPS1 in about a 1,500 amino acid overlap that contains two structurally
similar modules highly similar to those of ChCPS1 (Figure 7B). FgCPS1 has no
significant similarity to ChCPS1 at the C-terminus, which is shorter and lacks
the thioesterase domain seen in ChCPS1.

AsCPS1 (2369 bp, SEQ ID NO:42) has 76% nucleotide identity to
ChCPS1 in the entire cloned region, which contains two conserved introns. The
translated AsCPS1 protein (partial) includes 758 amino acids (SEQ ID NO:43)
corresponding to amino acids 511-1269 in ChCPS1 and has up to 93% identity,
95% similarity to ChCPS1 (Figure 7B).

PtCPS1 (2320 bp, SEQ ID NO:44) has 78% nucleotide identity to
ChCPS1 in the entire cloned region, which contains only one intron. The
translated PtCPS1 protein (partial) includes 758 amino acids (SEQ ID NO:45)
corresponding to amino acids 511-1269 in ChCPS1 and has 93% identity, 96%
similarity to ChCPS1.

CiCPS1 (2435 bp, SEQ ID NO:46) has 65% nucleotide identity to
ChCPS1 in the entire cloned region, which has no introns. The translated
CiCPS1 protein (partial) includes 812 amino acids (SEQ ID NO:47)
corresponding to amino acids 511-1040 in ChCPS1 and has 67% identity, 80%
similarity to ChCPS1 (Figure 7B). Another ortholog in Candida was identified
by Southern blot (see Figure 4).

BLAST searches using SEQ ID NO:41 (Figure 6) and SEQ ID NO:47
(Figure 7A) identified orthologs of those fungal CPS1s.
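
For illustration, percent identity over an aligned overlap can be computed as
in the following sketch. It assumes two pre-aligned, equal-length strings with
"-" as the gap character, and it counts only columns in which neither sequence
has a gap; alignment programs differ in how gapped columns are counted, so
this is one possible convention and not necessarily the one used to obtain the
values reported above. The aligned strings shown are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of ungapped alignment columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: 4 matches in 5 ungapped columns.
print(percent_identity("ACGT-A", "ACCTTA"))
```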

Disruption of FgCPS1 in F. graminearum (= Gibberella zeae), the wheat
head scab fungus, caused significantly reduced virulence to wheat. All cps1-
disruptants of F. graminearum showed at least 50% (when inoculated with
10^5/ml conidia) or even 80-90% (when inoculated with 10^4/ml conidia)
reduction in ability to cause a typical "white head" symptom on the host,
whereas under the same conditions ectopic transformants caused disease symptoms
indistinguishable from wild type. These results suggest that CPS1 is also
required for pathogenesis by fungi that are distantly related to C.
heterostrophus, arguing that these peptide synthetase gene homologs might
control biosynthesis of a general fungal virulence factor.

Discussion 

Conservation of CPS1 and taxonomy. By genomic DNA hybridization,
C. heterostrophus CPS1 homologs were found in 16 additional fungal species
belonging to 5 genera. Hybridization signals for some were as strong as for the
C. heterostrophus gene, indicating that CPS1 is highly conserved among these
fungi. This conservation appears to match the taxonomic relationships between
these species. Cochliobolus (anamorph Bipolaris) and Setosphaeria (anamorph
Exserohilum) are closely related genera.

Two species, C. victoriae and C. carbonum, which are able to cross to
each other and thus may not be different species (Scheffer et al., 1967; Yoder et
al., 1989), showed the same hybridization pattern to CPS1. B. sacchari, the
closest asexual relative of C. heterostrophus, hybridized to two HindIII
fragments that were only seen in C. heterostrophus itself, but all other species
gave only one distinct polymorphic band. Phylogenetic analyses using the
internal transcribed spacer (ITS) sequences and fragments of the GPD (vanWert
and Yoder, 1992) and MAT genes (Turgeon et al., 1993, supra) also put C.
victoriae/C. carbonum and C. heterostrophus/B. sacchari closest to each other
(Turgeon and Berbee, 1997). These results might imply that CPS1 has
co-evolved with these genes.

CPS1 homologs and pathogenesis. The genera Cochliobolus and
Setosphaeria include many plant pathogenic species that are commonly
associated with leaf spots or blights, mainly on cultivated cereals and wild
grasses (Sivanesan, 1987; Alcorn, 1988). This group of phytopathogenic fungi
includes both mild pathogens and severe pathogens that often produce
host-specific toxins (Yoder, 1980, supra). One of the essential questions is
whether the various diseases on diverse host plants caused by these fungi
involve common factors or depend only on individual specific factors, such as
host-specific toxins.

Previous studies have shown that host-specific toxins can be critical
factors for determining either virulence or host range, but they do not account
for general pathogenicity since they are produced only by certain isolates in
the species and the corresponding biosynthetic genes are found only in these
toxin-producing isolates (Yoder et al., 1997, supra). In contrast, CPS1
homologs are found in all Cochliobolus and Setosphaeria species tested so far,
suggesting they are a common factor shared by this group. Disruption of the
CPS1 homolog in the oat pathogen C. victoriae caused dramatically reduced
virulence to victorin-susceptible oats although the transformants produced
wild type levels of victorin. This result is similar to that with C.
heterostrophus race T, in which cps1- disruptants still produced wild type
levels of T-toxin but showed reduced virulence on T-cytoplasm corn. These
results argue strongly that host-specific toxins alone are not sufficient in
determining the ultimate outcome of fungus/plant interactions and suggest that
the establishment of disease by these fungi also requires CPS1, which might
control a pathway for general pathogenicity.

The CPS1 gene cluster and homologs could be fungal "pathogenicity
islands". In the early 1990s, studies on pathogenesis by uropathogenic E. coli
led to the identification of pathogenicity gene clusters, termed "pathogenicity
islands" (Hacker et al., 1990; Blum et al., 1994). Subsequently, similar gene
clusters were identified in additional animal or human bacterial pathogens,
including Yersinia pestis, Helicobacter pylori and Salmonella typhimurium.
These islands often contain genes for production of toxins or genes encoding
proteins that are capable of interacting with host defense factors or are
required for type III secretion systems that deliver virulence proteins into
host cells. Usually, they are found only in pathogenic strains (or species); in
rare cases, they occur in nonpathogenic strains of the same species or related
species (Hacker et al., 1997, supra).

In phytopathogenic bacteria, hrp gene clusters have been referred to as
"pathogenicity islands" because they have several features in common with
"pathogenicity islands" in animal pathogenic bacteria, i.e., they are found only in
pathogenic species (required for plant pathogenicity) and contain highly
conserved genes (hrc genes) defining the type III protein secretion system
(Alfano and Collmer, 1996; Barinaga, 1996).

In plant pathogenic fungi, genes or gene clusters with characteristics of
"pathogenicity islands" have been identified from certain species, i.e., in Nectria
haematococca, the PDA genes for detoxifying the pea phytoalexin and other pea
pathogenicity genes (PEP) are located on dispensable chromosomes that are
found in all isolates pathogenic to pea but usually absent in all nonpathogenic
isolates (VanEtten et al., 1994; Liu et al., 1997, supra). In the genus
Cochliobolus, the Tox2 gene cluster controlling the biosynthesis of HC-toxin is
found only in C. carbonum race 1 (pathogenic to hm1hm1 corn) and the Tox1
genes controlling T-toxin production are found only in C. heterostrophus race T
(highly virulent on T-cytoplasm corn); all other races of the same species and all
other fungal species tested so far lack these Tox genes (Ahn and Walton, 1996,
supra; Yang et al., 1996, supra; Yoder et al., 1997, supra).

CPS1 differs in two important ways from these fungal
"pathogenicity islands". First, it is highly conserved among several
phytopathogenic Cochliobolus species and relatives. Second, like certain
bacterial "pathogenicity islands", CPS1 also has homologs in "nonpathogenic"
species. C. homomorphus and C. dactyloctenii, neither of which causes disease
on plants, hybridized strongly to CPS1. This may reflect genetic changes in the
"pathogenicity island" that resulted in loss of pathogenicity. In the bacterial
genus Listeria, which includes several human or animal pathogenic species
harboring highly conserved "pathogenicity islands", the "pathogenicity island"
homolog in the nonpathogenic species (L. seeligeri) was found to be 'silent' due
to a mutation that occurred in the promoter region of a critical regulatory gene in
the cluster (Hacker et al., 1997, supra). These features suggest that the CPS1
gene cluster and homologs could define a new group of fungal "pathogenicity
islands".

The origin of CPS1. It is known that the evolution of pathogenicity
involves two major processes. A pathogenic microorganism could originate
from nonpathogenic progenitors by slow modifications (such as point mutations
and genetic recombination) of genes that were adapted for parasitic growth on
hosts, or by the integration of large fragments of "alien" DNA into the genome
that enable the recipient to attack particular hosts (horizontal gene transfer).
The latter can occur in the recent or distant evolutionary past. Subsequent
vertical transmission in the lineage (if the transferred gene is stable in the
recipient genome) would result in the preservation of the gene in all species
that diverged after the acquisition of the gene(s) (Scheffer, 1991; Arber,
1993; Krishnapillai, 1996; Burdon and Silk, 1997).

In the past few years, substantial evidence has become available that
supports the hypothesis of horizontal gene transfer. All "pathogenicity islands"
in animal pathogenic bacteria are believed to have been acquired by a horizontal
transfer event (recent or past) because they usually differ in G+C content from
the recipient genome and have transposable elements at the boundaries of the
gene clusters (Hacker et al., 1997, supra). The hrp "pathogenicity islands" do
not show a significant difference in G+C content or association with
transposable elements, but they are also believed to have arisen similarly because
hrc genes in these "pathogenicity islands" show high similarity to genes defining
the type III protein secretion system found in animal pathogenic bacteria as
mentioned above (Alfano and Collmer, 1996; Barinaga, 1996).

Although CPS1 itself has several typical fungal introns and a G+C
content (51.5%) similar to most known fungal genes, genomic regions (about 1.5
kb) flanking the gene have higher G+C content (>60%). Several short G+C-rich
regions are also found in the gene cluster; one of the open reading frames
(ORF10) has a 63.6% G+C content. Compared to those filamentous fungal
genomes characterized so far, including N. crassa, A. nidulans, U. maydis (all
have G+C content 51-54%, see Karlin and Mrazek, 1997, supra), the genomic
region around CPS1 is unusual. This might suggest that the gene cluster
harboring CPS1 came from a bacterial source (since most bacterial genes are
known to have a high G+C content), but has evolved into a fungal version.
Based on these data, CPS1 homologs may have a common ancestral gene
which was acquired from a bacterial species via horizontal transfer and then
maintained by the fungal genome via vertical transmission in closely related
lineages.
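
The G+C comparison described above can be illustrated by the following sketch,
which computes the G+C percentage of a sequence and of successive windows
along it. The toy sequences, the window and step sizes and the function names
are hypothetical and are given as an example only.

```python
def gc_percent(seq):
    """G+C content as a percentage of the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step):
    """Return (window start, G+C %) for each full window along the sequence."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: a G+C-rich half followed by an A+T-rich half.
print(gc_percent("GCAT"), windowed_gc("GGCCAATT", 4, 4))
```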

In the evolution process, the genus Cochliobolus could also have
inherited a second gene (X) controlling the ability to take up foreign DNA, by
which its ancestor took the "alien" CPS1. As a result, this group of fungi is able
to keep trapping genes from other organisms by additional "horizontal transfers",
giving rise to new races or even new species characterized by the ability to
produce unique pathogenesis factors. The direct support for this hypothesis is
that both the Tox2 locus of C. carbonum and the Tox1 locus of C. heterostrophus
are associated with large fragments of "alien" DNA (A+T-rich and highly
repeated), and the same could also be true for Tox3 controlling victorin
production by C. victoriae, although there is as yet no direct experimental
evidence (Ahn and Walton, 1996, supra; Yang et al., 1996, supra; Yoder et al.,
1997, supra). In contrast to CPS1, these gene transfers must have occurred in
the recent evolutionary past because both Tox1 and Tox2 loci are found only in
specific isolates in the species, e.g., the acquisition of Tox1 genes probably
occurred as recently as the 1960s when race T was first identified in the field
(Yoder et al., 1997, supra).

There are other possibilities for the evolution of CPS1. First, each genus
mentioned above could have acquired CPS1 independently after divergence of
the lineage. But this seems less likely, because the similar hybridization
signal intensity of the homologs detected in Cochliobolus and Setosphaeria
would require the acquisitions to have happened at the same time and to have
involved the same donor organism. Second, the horizontal transfer of CPS1
could have occurred at earlier time periods, such as before the divergence of
Pleosporales or even the Ascomycotina. To test these hypotheses, detection of
CPS1 homologs in Pyrenophora, Pleospora and other genera must be done by either
genomic DNA hybridization or PCR. Based on the facts discussed here, it is not
unreasonable to predict that additional CPS1 homologs will be found in other
fungal species. Further investigation could provide a direct entry point for
understanding the evolution of fungal pathogenesis to plants.

Example 5 

Other Genes Near Cochliobolus CPS1

Materials and Methods 

Construction of genomic library of C. heterostrophus. The cosmid
SuperCosP1-11 (kindly provided by Dr. Thomas Hohn of Mycotoxin Research
Unit USDA/ARS), which is a modification of the cosmid vector cosHyg1
(Turgeon et al., 1993, supra), was used for library construction. Genomic DNA
of strain C4 (Tox+; MAT-2) was prepared as previously described (Yoder, 1988,
supra) and purified by equilibrium centrifugation in CsCl-ethidium bromide
gradients (Sambrook et al., 1989, supra). Three µg of genomic DNA was
partially digested with MboI using a test series of enzyme dilutions (1.5 x
10^-4 to 1.25 units, New England Biolabs, Beverly, MA) at 37°C for 0.5 hour.
DNA from the digestions which yielded fragments with an average size of 30 kb
was pooled and then dephosphorylated with Calf Intestinal Alkaline Phosphatase
(CIAP, GIBCO BRL Products, Gaithersburg, MD). Two µg of CIAP-treated DNA was
ligated into the BamHI site of the cosmid vector that had been digested with
XbaI and treated with CIAP. Aliquots of the ligated molecules were packaged
using Gigapack II Packaging Extract (Stratagene, La Jolla, CA) according to the
manufacturer's recommendations. E. coli strain NM554 was transfected with the
packaged phage particles and selected for ampicillin resistance. Approximately
1.6 x 10^5 independent ampicillin resistant colonies were obtained from two
experiments. Cosmid DNAs were made from 16 colonies and digested with
HindIII and EcoRI, respectively, to confirm random insertions. Colonies were
scraped from each of the original LB plus ampicillin plates and stored at -70°C
in 25% glycerol (one plate of colonies per tube).
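
As a rough, non-limiting illustration of how the size of such a library
relates to genome representation, the sketch below applies the commonly used
relation N = ln(1 - P) / ln(1 - f), where f is the average insert size divided
by the genome size. The 30 kb insert size and the 1.6 x 10^5 colony count are
taken from the text; the genome size is NOT stated in the text, and the value
used here (35 Mb) is an assumption made purely for the example.

```python
import math

INSERT_BP = 30_000             # average insert size, from the text
LIBRARY_CLONES = 160_000       # independent colonies, from the text
ASSUMED_GENOME_BP = 35_000_000  # ASSUMPTION for illustration; not in the text


def clones_needed(probability, insert_bp, genome_bp):
    """Clones required so that a given sequence is present with `probability`."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


def fold_coverage(n_clones, insert_bp, genome_bp):
    """Total cloned DNA expressed in genome equivalents."""
    return n_clones * insert_bp / genome_bp


print(clones_needed(0.99, INSERT_BP, ASSUMED_GENOME_BP))
print(fold_coverage(LIBRARY_CLONES, INSERT_BP, ASSUMED_GENOME_BP))
```

Under the assumed genome size, the number of independent colonies reported
above would exceed the number needed for 99% representation by a wide margin;
a different genome size changes the figures proportionally.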

Screening of the cosmid library. A mixture of cosmid clones from 23
stored tubes was diluted to 10^-4, spread on ten LB plus ampicillin plates
(150 x 15 mm) and incubated at 37°C overnight. Colonies (total about 1.2 x
10^4) were transferred to Colony/Plaque Screen Hybridization Transfer Membrane
(137 mm discs, NEN™ Life Science Products, Boston, MA) and incubated at 37°C
for 8 hours. Three replicates were made of each plate (one as master filter and
two for probing). For hybridization, colonies on the filters were lysed in
0.5 N NaOH, 1.5 M NaCl for 5 minutes, neutralized twice in 1 M Tris pH 7.4,
1.5 M NaCl for 5 minutes, followed by 2 x SSC for 2 minutes. Filters were air
dried for 30 minutes, then baked in a vacuum oven at 80°C for 1 hour. Duplicate
filters were probed with 32P-labeled 3.4 and 3.2 kb fragments of the CPS1 gene
(cloned on p214B7 and p214S1, respectively) that were prepared by restriction
enzyme digestion and purification using QIAquick Gel Extraction Kit (QIAGEN
Inc., Chatsworth, CA). Hybridization was in 6 x SSC, 1 x BLOTTO (Sambrook et
al., 1989) at 65°C overnight. Filters were then washed twice for 15
minutes at 65°C in 2 x SSC, 0.1% SDS. Cosmid clones corresponding to positive
areas were transferred from the master filters into a 96-well microtiter plate
(Corning Costar, Cambridge, MA) and allowed to grow at 37°C overnight. Cells
were then transferred onto membranes using a frogger, incubated and processed
as above. Positive clones were purified and re-tested by hybridization with
the same probes as mentioned above. The isolated cosmid clones were mapped
by probing cosmid DNA digested with several enzymes with the labeled 3.4 and
3.2 kb CPS1 fragments separately.

DNA manipulations and sequencing . Cosmid DNA was prepared using 
standard protocols (Sambrook, et al., 1989, supra). Restriction enzyme 
digestions, gel electrophoresis, gel blot analysis, primer design, DNA sequencing 
and sequence analysis were done as described above. To facilitate sequencing, 

25 three deletion constructs were made by digestion of the original cosmid clones 
(Table 10) with restriction enzymes that do not cut the cosmid vector, followed 
by religation (Table 10). Sequencing of each cosmid clone was initiated with 
vector-specific and CPS1 (or ri£S7)-specific primers. Subsequently, sequences 
were extended by designing new primers to the previously sequenced region 

30 (Table 11). 
Results 
Characterization of two overlapping cosmid clones. Two cosmid clones,
C4L6582 and C4L7296, were isolated by screening the library (Table 10). Gel
blot analysis indicated that both cosmid clones span the vector insertion site in
the REMI mutant and contain the cloned CPS1 and TES1 sequences described
above. Sequence obtained using a primer to the region immediately flanking the
insertion site is the same as that in the tagged DNA recovered from the REMI
mutant, confirming that no deletions or chromosome rearrangements occurred at
the tagged site. The two cosmids overlap in a 27.9 kb region. C4L7296
(37.2 kb) carries a 30.9 kb genomic insert which hybridized to both the 3.4 kb
and 3.2 kb CPS1 fragments. Restriction mapping and sequencing confirmed that this
insert contains the entire TES1 sequence and most of the CPS1 sequence (4.4 out
of 5.4 kb). C4L6582 (37.7 kb) carries a 31.4 kb insert that also includes the
entire TES1 sequence but only 1.1 kb of the N-terminal encoding sequence of
CPS1. Both inserts lack the C-terminal region of CPS1; their 3' end is ligated to
the T3 end of the cloning site in SuperCosP1-11. Attempts to sequence using the
T7 primer were unsuccessful, presumably because the T7 end, which is close to one
of the cos sites on SuperCosP1-11, was disrupted during the packaging process.
Table 10. Cosmid and plasmid clones used in this study

Clone          Length  Characteristics                                Reference
               (kb)

SuperCosP1-11  6.9     Cosmid vector for library construction         Horwitz et al.,
                       containing the [illegible] fragment from       1997
                       [illegible] carrying the hygB gene fused to
                       C. heterostrophus promoter 1.

pUCATPHN       4.8     Cloning vector derived from pUCATPH.           This study

C4L6582        37.7    A cosmid clone with a 31.4 kb insert           This study
                       isolated from screening the library.
                       Includes 4.0 kb region p214B7.

C4L7296        37.2    A cosmid clone with a 30.9 kb insert           This study
                       isolated from screening the library.
                       Includes 6.3 kb region p214B7+p214S1.

p6582dH        10.9    A deletion (28.8 kb) construct derived         This study
                       from digestion of C4L6582 with HindIII.

p6582dS        21.1    A deletion (16.6 kb) construct derived         This study
                       from digestion of C4L6582 with SacI.

p7296dX        9.0     A deletion (28.2 kb) construct derived         This study
                       from digestion of C4L7296 with XhoI.

pDXPS*         13.6    Ligation of 7296dX digested with               This study
                       [illegible] to the [illegible]-digested
                       pUCATPHN.

pDXPSH*        6.5     A plasmid derived from pDXPS by HindIII        This study
                       digestion and religation of a 6.5 kb HindIII
                       fragment containing the entire pUCATPHN
                       sequence flanked by 1.2 kb of the 5' end of
                       CPS1 and 0.5 kb of the 3' end of C4L7296
                       sequence.

* Designed for deletion of the 28.2 kb genomic region (= the region deleted
from p7296dX, including 3.6 kb of CPS1 N-terminal encoding sequence), but
transformation of wild type was unsuccessful.
Table 11. Primers used for sequencing genomic DNA on C4L7296 and
C4L6582

Name (a)      Position (b)  Sequence (c)     Template (d)  Origin

F-I
    214RP7                  SEQ ID NO:148    A             p214B7
 1. RP8       4940          SEQ ID NO:149    A             7296RP
 2. RP9       592           SEQ ID NO:150    A             7296RP8
 3. RP10      4124          SEQ ID NO:151    A             7296RP9
 4. RP11      3790          SEQ ID NO:152    A             7296RP10
 5. RP12      3424          SEQ ID NO:153    A             7296RP11
 6. RP13      2970          SEQ ID NO:154    A             7296RP12
 7. RP14      2362          SEQ ID NO:155    A             7296RP13
 8. RP15      1764          SEQ ID NO:156    A             7296RP14
 9. RP16      1169          SEQ ID NO:157    A             7296RP15
10. RP17      647           SEQ ID NO:158    A             7296RP16

F-II
    214RP2                  SEQ ID NO:159    B             p214B7
11. SRP1      3095          SEQ ID NO:160    A             6582dSRP2
12. SRP2      2755          SEQ ID NO:161    A             7296dSRP1
13. SRP3      2366          SEQ ID NO:162    A             7296dSRP2
14. SRP4      2008          SEQ ID NO:163    A             7296dSRP3
15. SRP5      1555          SEQ ID NO:164    A             7296dSRP4
16. SRP6      1187          SEQ ID NO:165    A             7296dSRP5
17. SRP7      647           SEQ ID NO:166    A             7296dSRP6
18. SFP1      3321          SEQ ID NO:167    A             6582dSRP2
19. SFP2      3660          SEQ ID NO:168    A             7296dSFP1
20. SFP3      3969          SEQ ID NO:169    A             7296dSFP2
21. SFP4      4345          SEQ ID NO:170    A             7296dSFP3
22. SFP5      4724          SEQ ID NO:171    A             7296dSFP4
23. SFP6      5137          SEQ ID NO:172    A             7296dSFP5
24. SFP7      694           SEQ ID NO:173    A             7296dSFP6

F-III
    TrpC                    SEQ ID NO:174    C             pUCATPH
    214FP6                  SEQ ID NO:175
25. CFP1      463           SEQ ID NO:176    A
26. CFP2      903           SEQ ID NO:177    A             [illegible]
27. CFP3      1[?]34        SEQ ID NO:178    A             7296pUCFP2
28. CFP4      1910          SEQ ID NO:179    A
29. CFP5      2491          SEQ ID NO:180    A             7296pUCFP4

F-IV
    214B7RP5                SEQ ID NO:181    E             p214B7
30. HRP1      592           SEQ ID NO:182    F             6582dHRP5
31. HFP1      763           SEQ ID NO:183    F             6582dHRP5

(a) "RP" indicates reverse primer; "FP" indicates forward primer. Primers
designed to genomic DNA on the cosmid clones are numbered in order.
Primers 1-10 are preceded by "7296"; 11-24 by "7296d"; 25-29 by "7296pU"
and 30-31 by "6582d".
(b) Primer position corresponds to position in the genomic sequences of each
fragment.
(c) Each primer sequence is given in the 5' to 3' direction.
(d) Cosmids or plasmids used for sequencing reactions. A = C4L7296; B =
6582dS; C = pDXPS; D = pDXPSH; E = 6582dH; F = C4L6582.
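
The naming rule in the first footnote of Table 11 (a prefix that depends on
the primer number) can be written out as a small lookup. The sketch below is
illustrative only: the function name full_primer_name and the data layout are
assumptions, while the number ranges and prefixes are taken from the footnote.

```python
# Prefix rule from the Table 11 footnote:
# primers 1-10 -> "7296", 11-24 -> "7296d", 25-29 -> "7296pU", 30-31 -> "6582d".
PREFIX_RANGES = [
    (range(1, 11), "7296"),
    (range(11, 25), "7296d"),
    (range(25, 30), "7296pU"),
    (range(30, 32), "6582d"),
]


def full_primer_name(number: int, short_name: str) -> str:
    """Build the full primer name from its Table 11 number and short name."""
    for numbers, prefix in PREFIX_RANGES:
        if number in numbers:
            return prefix + short_name
    raise ValueError(f"primer number {number} is outside the range 1-31")


if __name__ == "__main__":
    for number, short_name in [(1, "RP8"), (12, "SRP2"), (29, "CFP5"), (30, "HRP1")]:
        print(number, full_primer_name(number, short_name))
```

The full names produced this way are the ones that appear in the Origin
column, for example 7296RP8 as the origin of primer RP9.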

Sequencing of C4L7296. A total of 27.4 kb of additional genomic sequence
5' of TES1 was cloned. Four fragments totaling 16.9 kb (60%) were
sequenced, three of which were sequenced using C4L7296 as template.
Sequencing of Fragment I (F-I, 5.3 kb) began with primer 214B7RP7 (which
matches the 5' end of TES1) and continued with primers designed to
previously determined sequences. Sequencing of Fragment II (F-II, 6.9 kb) was
started using primers to sequences flanking the SacI site, previously determined
by sequencing the deletion construct 6582dS (see Table 10), and was subsequently
extended in both directions. The sequence of Fragment III (F-III, 3.2 kb) was
obtained in a more complicated manner, as part of the attempt to create a
deletion construct for transformation. The first part of the sequence was
obtained from the clone pDXPS, derived from deletion construct 7296dX
(Table 10), using the TrpC primer, and the sequence was extended to the 3' end
using C4L7296 as template. A 200 bp region at the 5' end of F-III was obtained
from a pDXPS-derived clone, pDXPSH (Table 10), using the CPS1-specific primer
214S1FP6.

Sequencing of C4L6582. This clone contains 2.8 kb of additional genomic
DNA extending into the region to the left end of C4L7296. The deletion clone
6582dH (Table 10) was used to initiate sequencing of Fragment IV (F-IV, 1.5
kb) using the TES1-specific primer 214B7RP5, followed by one step of sequence
extension in both the 3' and 5' directions on C4L6582.
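
As a quick check of the figures quoted in the two sequencing paragraphs
above, the four fragment sizes can be summed and compared with the 27.4 kb of
additional genomic sequence. The snippet is only a worked calculation; the
variable names are illustrative and the numbers are those given in the text.

```python
# Fragment sizes in kb, as stated in the text.
fragments_kb = {"F-I": 5.3, "F-II": 6.9, "F-III": 3.2, "F-IV": 1.5}

# Additional genomic sequence 5' of TES1 that was cloned, in kb.
additional_genomic_kb = 27.4

total_kb = sum(fragments_kb.values())
fraction = total_kb / additional_genomic_kb

print(f"sequenced: {total_kb:.1f} kb")
print(f"fraction of additional sequence: {fraction:.1%}")
```

The sum reproduces the 16.9 kb total, and the ratio comes to a little under
62%, which the text reports as 60%.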

Identification of open reading frames in the sequenced region. Eleven
open reading frames (ORFs) were identified in the four sequenced fragments
(Table 12). These ORFs are all relatively small (0.3-2.3 kb). Five ORFs contain
putative introns with typical fungal characteristics (Table 13). ORF12, ORF10,
ORF14, ORF5 and ORF8 are transcribed in one direction; the others are
transcribed in the opposite direction. ORF6 and ORF7 (in F-II) overlap and are
transcribed in the same direction. ORF14 and ORF9 (in F-III), and ORF3 and ORF8
(in F-I), also overlap but are transcribed in opposite directions. Most ORFs
have a G+C content of 50-55%, the normal range for most fungal genes, with two
exceptions: ORF10 (0.3 kb), at the 5' end of F-III, has a G+C content of 63.6%,
and ORF14 (0.7 kb, located 1.0 kb downstream of ORF10) has a G+C content of
56.9%. Both ORFs are located in a G+C-rich (about 58.0%) region in F-III
(positions 300-800 and 1240-2040, respectively).
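
The G+C comparison made above can be reproduced for any sequence with a few
lines of code. The sketch below is a generic illustration, not the software
used in this study: the function names are assumptions, and the 50-55% window
is simply the range the text describes as normal for most fungal genes.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so characters such as N
    or whitespace do not affect the result.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


def in_typical_range(gc_percent: float, low: float = 50.0, high: float = 55.0) -> bool:
    """Report whether a G+C percentage falls inside the 50-55% window."""
    return low <= gc_percent <= high


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))        # toy sequence, half G+C
    # Percentages reported in Table 12 for three of the ORFs.
    for orf, gc in [("ORF11", 52.6), ("ORF10", 63.6), ("ORF14", 56.9)]:
        print(orf, gc, in_typical_range(gc))
```

Applied to the Table 12 values, ORF10 and ORF14 fall outside the window,
matching the two exceptions named in the text.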

Database searches suggested that three ORFs (ORF3, ORF7 and ORF11),
as well as CPS1 and TES1, encode homologs of known proteins (see below) and
that the others encode, if anything, proteins with unknown functions
(Table 12). ORF17 (SEQ ID NO:48) encodes an iron reductase (SEQ ID NO:49) and
ORF15 (SEQ ID NO:55) encodes a permease/MFS transporter (SEQ ID NO:56).
Figure 9A shows the results of a BLAST search with SEQ ID NO:49 and Figure 10
shows the results of a BLAST search with the polypeptide encoded by SEQ ID NO:55.
Table 12. Open reading frames (ORFs) identified in sequenced genomic regions
of C4L7296 and C4L6582

Region (a)  ORF (b)    Size (kb)  No. of       G+C (%)  Putative function
                                  introns (c)

F-I'        ORF1 (d)   5.4        3            51.5     Peptide synthetase
F-I'        ORF2 (d)   1.1        1            55.5     Thioesterase
F-I         ORF3       1.8        3            50.0     DNA-binding
F-I         ORF8       0.5        0            55.2     unknown
F-I         ORF11      1.9        0            52.6     CoA transferase
F-II        ORF5       2.3        1            54.1     unknown
F-II        ORF6       0.5        0            51.6     unknown
F-II        ORF7       1.7        1            52.0     Decarboxylase
F-III       ORF9       0.7        0            54.2     unknown
F-III       ORF10      0.3        0            63.6     unknown
F-III       ORF13      0.8        1            53.6     unknown
F-III       ORF14      0.7        0            56.9     unknown
F-IV        ORF12      1.2        1            49.2     unknown

(a) F-I' = Genomic DNA
(b) The positions of ORF3-ORF14 and 17 in the sequenced fragment are indicated;
ORFs corresponding to known proteins are underlined.
(c) The characteristics of putative introns are given in Table 13.
(d) Characterized as CPS1 and TES1



Table 13. Characteristics of putative introns in ORFs identified in sequenced
genomic regions on cosmids C4L7296 and C4L6582

ORF        Intron  Size  Location (a)      5' Border    3' Border  Branch site
                   (bp)

ORF3       I       64    F-I 5094-5031     GTACGT       TAG        CGCTGAC
           II      46    F-I 5006-4961     GTGAGT       TAG        AGCTAAG
           III     46    F-I 4477-4432     GTACGT       CAG        AGCTGAC
ORF5       I       48    F-II 3477-3524    GTATGT       TAG        TGCTAAC
ORF7       I       114   F-II 2307-2194    GTGTGC       CAG        ATCTAAC
ORF13      I       51    F-III 2742-2692   GTGCGT       CAG        TACTGAT
ORF12      I       47    F-IV 1007-1053    GTAAGT       TAG        GATTGAC
Consensus                                  GT(A/G)YGT   (T/C)AG    NRCTAAC (b)

(a) Number of the fragment followed by the position of the first and last
nucleotide of the intron with respect to the total sequence.
(b) Y = pyrimidine (T or C); R = purine; N = purine or pyrimidine.
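
The intron sizes in Table 13 follow directly from the listed first and last
positions. The sketch below recomputes them; it is only a consistency check,
with illustrative names. Reading a first position larger than the last as an
intron on the reverse strand of the fragment numbering is an assumption made
here, not a statement from the table.

```python
from typing import NamedTuple


class Intron(NamedTuple):
    orf: str
    fragment: str
    first: int          # position of the first nucleotide of the intron
    last: int           # position of the last nucleotide of the intron
    reported_size: int  # size in bp as listed in Table 13


INTRONS = [
    Intron("ORF3", "F-I", 5094, 5031, 64),
    Intron("ORF3", "F-I", 5006, 4961, 46),
    Intron("ORF3", "F-I", 4477, 4432, 46),
    Intron("ORF5", "F-II", 3477, 3524, 48),
    Intron("ORF7", "F-II", 2307, 2194, 114),
    Intron("ORF13", "F-III", 2742, 2692, 51),
    Intron("ORF12", "F-IV", 1007, 1053, 47),
]


def intron_size(first: int, last: int) -> int:
    """Length in bp of an intron whose end positions are both included."""
    return abs(last - first) + 1


def orientation(first: int, last: int) -> str:
    """Assumed reading: descending coordinates mean the reverse strand."""
    return "forward" if first <= last else "reverse"


if __name__ == "__main__":
    for intron in INTRONS:
        size = intron_size(intron.first, intron.last)
        status = "ok" if size == intron.reported_size else "MISMATCH"
        print(intron.orf, intron.fragment, size,
              orientation(intron.first, intron.last), status)
```

Under that reading, the introns of ORF5 and ORF12 share one orientation and
those of ORF3, ORF7 and ORF13 the other.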
Discussion 

Two cosmids define a large gene cluster. The C. heterostrophus CPS1
gene was cloned by identification of genomic DNA fragments recovered from
the tagged site in a mutant generated using REMI insertional mutagenesis.
Characterization of two overlapping cosmid clones in this study showed that
no deletions or chromosome rearrangements are associated with the gene tagging
event, because both cosmids carry the same fragment, which spans the REMI
insertion site, and the nucleotide sequence in this region is the same as that
of the genomic DNA recovered from the tagged site. This confirms the
identity of CPS1, which is the major biosynthetic gene. Mapping and
sequencing of the two cosmids extended the sequence by 27.4 kb from the
previously cloned fragment, leading to the characterization of 38.7 kb of
contiguous genomic DNA, the largest genomic region analyzed so far in C.
heterostrophus. In addition to CPS1 and TES1, sequence analysis of this region
revealed at least 11 open reading frames; three of them, designated DBZ1,
CAT1 and DEC2, respectively, apparently encode functional proteins (Table 12).
The tight linkage of these genes suggests that they may be involved in the same
pathway.

In filamentous fungi, genes in pathways for biosynthesis of secondary
metabolites are in some cases dispersed on different chromosomes, e.g., the
cephalosporin C pathway genes in Acremonium chrysogenum (Mathison et al.,
1993, supra) and the melanin pathway genes in Colletotrichum lagenarium
(Kubo et al., 1996, supra). In other cases, tightly linked genes are found
to be functionally related to a common pathway. This clustered organization
is exemplified by the sterigmatocystin pathway genes of Aspergillus
nidulans, in which 25 coordinately regulated transcripts are found in a 60 kb
genomic region (Brown et al., 1996), and the trichothecene pathway genes of
Fusarium sporotrichioides, in which 9 genes are clustered in a 25 kb region and
8 of them have been shown to be required for pathway function (Hohn et al.,
1995). The genes involved in biosynthesis of certain fungal peptides are also
found as clusters. The tight linkage between CPS1 and these additional genes
might reveal the presence of a novel secondary metabolite pathway in C.
heterostrophus. In this pathway, CPS1 is the major structural gene, since it
encodes a large multifunctional enzyme with all catalytic activities required for
synthesis of a secondary metabolite, presumably a peptide phytotoxin; other
genes may carry out different functions required for coordinate operation of the
pathway, such as regulation, posttranslational modification or substrate
processing, as discussed below.

Significance of the CPS1 gene cluster. Both functional and structural
analyses strongly support the hypothesis that the CPS1 gene cluster controls a
novel biosynthetic pathway. Pathway genes have been studied in only a few
filamentous fungi, mainly for industrial purposes (Keller et al., 1997, supra).
For plant pathogenic fungi, little is known about pathway genes for fungal
pathogenesis. In C. heterostrophus, recent cloning of two Tox1 genes, PKS1
(Yang et al., 1996, supra) and DEC1 (Rose et al., 1996, supra), has contributed
to a breakthrough in understanding the molecular mechanism for biosynthesis of
T-toxin, a virulence determinant in the fungus/corn interaction. But further
identification of related pathway genes has been unsuccessful because the two
genes are located on different chromosomes and each is embedded in A+T-rich
DNA (Yoder et al., 1997, supra). In contrast, the CPS1 cluster provides a good
opportunity to explore a pathogenesis pathway.

First, it resides in a "normal" sequence region. A G+C content of 50-55%
is found in most of the cloned sequences and no A+T-rich DNA is associated
with either end of the cloned region. This would facilitate cloning of additional
pathway genes by further chromosome walking, by screening of cosmid libraries
or by targeted integration and plasmid rescue. Second, it contains a regulatory
gene (DBZ1) which is presumably linked to a signal transduction pathway.
Isolation of genes that interact with DBZ1 could reveal novel factors mediating
the molecular communication between the fungal pathogen and the host plant.
Further characterization of DBZ1 (along with position-specific disruption or
deletion) would also be helpful in determining the limits of the gene cluster,
because tightly linked genes involved in a common pathway are often
coordinately regulated by the same regulatory factor (Keller et al., 1997, supra).
Finally, CPS1 is found in both race T and race O, and its homologs are
also found in other Cochliobolus species. The high G+C content may
imply that these genes evolved from a bacterial ancestor, and their conservation
in these fungi may correlate with the phytopathogenic function of the gene
products encoded by the CPS1 cluster. Further investigation of this cluster
should provide insights into the evolution of general pathogenicity factors
among this group of fungi.

ORF17 is an iron reductase (SEQ ID NO:49) and ORF15 is a
permease/MFS transporter (SEQ ID NO:56). Ferric reductases are a group of
enzymes found in bacteria, fungi, plants and animals that are responsible for
reduction of ferric iron to ferrous iron, an absorptive form used by the organism.
They have been well studied in S. cerevisiae, C. albicans, H. capsulatum and
the like. The yeast FER1 has been expressed in tobacco (Oki et al., 1999).

Previous studies have shown that FER genes could be important
pathogenic determinants. Timmerman and Woods have proposed that in H.
capsulatum FER could play critical roles in the acquisition of iron in three
different ways: from inorganic or organic ferric salts, from host Fe(III) binding
proteins (transferrin and the like), and from siderophores produced by the fungus
itself (to reduce and release the iron chelated by the siderophore molecules).

On the other hand, iron sequestration in response to microbial infection
has been demonstrated to be a host defense mechanism. The infection-related
iron acquisition system in the pathogen can be considered to be an important
mechanism against host defense and for a successful colonization by the
pathogen in the host cells. This could be a general mechanism for all pathogenic
fungi.

CPS1 may encode an enzyme which is responsible for biosynthesis of a
novel siderophore with unusual amino acid and hydroxy acid constituents and an
unusual architecture. The CPS1 siderophore can compete with the host for iron
acquisition when the fungus enters its host cells, where iron is limited due to
host sequestration. In particular, for root pathogens such as C. victoriae,
sequestration may be stronger at the root surface. This could explain why the
cps1 mutant showed drastically reduced virulence. FER1 could be required to
release iron from the CPS1 siderophore, which would explain its location near
the CPS1 gene. Moreover, fungal strains could be cultured under iron-limiting
conditions, because CPS1, and likely other genes in the cluster, may be turned
on only under conditions of iron depletion.



All publications, patents and patent applications are incorporated herein
by reference. While in the foregoing specification, this invention has been
described in relation to certain preferred embodiments thereof, and many details
have been set forth for purposes of illustration, it will be apparent to those skilled
in the art that the invention is susceptible to additional embodiments and that
certain of the details herein may be varied considerably without departing from
the basic principles of the invention.
WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a fungal nucleic acid segment
which encodes a polypeptide which is substantially similar to a
polypeptide encoded by a nucleic acid sequence comprising an open
reading frame comprising SEQ ID NO:46, SEQ ID NO:48, or SEQ ID
NO:55, or the complement thereof.

2. An isolated polynucleotide comprising a fungal nucleic acid segment
which is substantially similar to a nucleic acid sequence comprising an
open reading frame comprising SEQ ID NO:46, SEQ ID NO:48, SEQ ID
NO:55, or the complement thereof.

3. An isolated polynucleotide comprising a fungal nucleic acid segment
which hybridizes under stringent hybridization conditions to SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:55, or the complement thereof.

4. The isolated polynucleotide of claim 1, 2 or 3 which consists of SEQ ID
NO:46, SEQ ID NO:48 or SEQ ID NO:55, or the complement thereof.

5. The isolated polynucleotide of claim 1, 2 or 3 wherein the nucleic acid
segment is from Ascomycota.

6. The isolated polynucleotide of claim 1, 2 or 3 wherein the nucleic acid
segment is from a pathogenic fungus.

7. The isolated polynucleotide of claim 1 wherein the nucleic acid segment
encodes a polypeptide having at least 80% identity to a polypeptide
comprising SEQ ID NO:47, SEQ ID NO:49 or SEQ ID NO:56.

8. The isolated polynucleotide of claim 1 wherein the nucleic acid segment
encodes a polypeptide having at least 90% identity to a polypeptide
comprising SEQ ID NO:47, SEQ ID NO:49 or SEQ ID NO:56.

9. An isolated polypeptide encoded by the polynucleotide of any one of
claims 1 to 8.

10. An expression cassette comprising a promoter operably linked to the
polynucleotide of any one of claims 1 to 8.

11. A recombinant vector comprising the polynucleotide of any one of claims
1 to 8 wherein the vector is capable of being stably transformed into a
host cell.

12. The vector of claim 11 wherein the polynucleotide is operably linked to a
promoter operable in a eukaryotic host cell.

13. The expression cassette of claim 10 or vector of claim 11 wherein the
polynucleotide is in sense orientation.

14. The expression cassette of claim 10 or vector of claim 11 wherein the
polynucleotide is in antisense orientation.

15. The vector of claim 11 wherein the polynucleotide is operably linked to a
promoter operable in a prokaryotic host cell.

16. A host cell comprising the expression cassette of claim 10.

17. A host cell comprising the vector of claim 11.
18. The host cell of claim 16 or 17 which is selected from the group
consisting of bacteria, yeast, plant and mammal.

19. A method for identifying an agent having fungicidal or mycocidal
activity, comprising:
a) contacting a fungus with an agent that binds to the polypeptide of
claim 9; and
b) identifying an agent having fungicidal or mycocidal activity.

20. An agent identified by the method of claim 19.

21. A method for identifying an inhibitor of a polypeptide, comprising:
a) contacting a host cell which expresses a polypeptide encoded by
the polynucleotide of any one of claims 1 to 8 with an agent; and
b) identifying an agent that inhibits the activity of the polypeptide.

22. An agent identified by the method of claim 21.

23. A method of inhibiting the growth or pathogenicity of a fungus,
comprising contacting the fungus with the agent of claim 20 or 22 in an
amount sufficient to inhibit the growth or pathogenicity of the fungus.

24. A method for identifying an agent having fungicidal or mycocidal
activity, comprising:
a) contacting a fungus with an agent that inhibits the activity of the
polypeptide of claim 9; and
b) identifying an agent having fungicidal or mycocidal activity.

25. A method for identifying an agent that modulates a polypeptide
associated with pathogenicity of a fungus, comprising:
a) contacting a fungus with an agent that binds the polypeptide of
claim 9; and
b) identifying an agent that modulates the pathogenicity of the
fungus.

26. A method for identifying an agent that modulates the pathogenicity of a
fungus, comprising:
a) contacting a fungus with an agent that inhibits the activity of the
polypeptide of claim 9; and
b) identifying an agent that modulates the pathogenicity of the
fungus.

27. A method of identifying agents that alter the phenotype of a fungal
pathogen or mycogen, comprising:
a) contacting an agent to be tested with one or more cells of a fungal
pathogen or mycogen which comprises a nucleotide sequence
encoding a polypeptide that is substantially similar to SEQ ID
NO:47, SEQ ID NO:49, or SEQ ID NO:56; and
b) detecting or determining whether the agent selectively modulates
expression or function or metabolic pathways associated with the
polypeptide, thereby altering a phenotype of the cells relative to
cells not contacted with the agent.

28. The method of claim 27 wherein the polypeptide is associated with
virulence or pathogenicity.

29. The method of claim 27 wherein the agent alters the activity of the
polypeptide.

30. The method of claim 27 further comprising identifying an agent having
fungicidal, mycocidal or anti-pathogenic activity.
31. The method of claim 27 wherein cellular growth is detected or
determined.

32. The method of claim 27 wherein the activity of the polypeptide is
detected or determined.

33. The method of claim 27 wherein virulence is detected or determined.

34. The method of claim 27 wherein the pathogen expresses the polypeptide.

35. The method of claim 27 wherein the pathogen does not express the
polypeptide.

36. A method of identifying agents that alter the phenotype of a fungal
pathogen or mycogen, comprising:
a) contacting an agent to be tested with one or more cells of a fungal
pathogen or mycogen wherein the cells have a mutation in a
nucleic acid sequence corresponding to the polynucleotide
according to any one of claims 1 to 8 which mutation results in
overexpression or underexpression of the encoded polypeptide;
b) detecting or determining whether the agent selectively modulates
expression or function or metabolic pathways associated with the
polypeptide, thereby altering a phenotype of the cells relative to
one or more wild type cells not contacted with the agent.

37. The method of claim 27 or 36 wherein the pathway is associated with the
production of a toxin or siderophore.

38. The method of claim 27 or 36 wherein the pathway is associated with
iron metabolism, uptake or absorption.

39. The method of claim 27 or 36 wherein the pathway is associated with
growth, virulence or pathogenicity.

40. An isolated antibody which specifically binds to the polypeptide of claim
9.

41. The antibody of claim 40 which is a monoclonal antibody.

42. The antibody of claim 40 which is a polyclonal antibody.

43. The method of claim 19, 23, 24, 25, 26, 27 or 36 wherein the fungus is a
recombinant fungus.

44. The method of claim 43 wherein the fungus comprises a recombinant
DNA molecule which encodes the polypeptide.

45. The method of claim 44 wherein the recombinant DNA molecule is
overexpressed.

46. The method of claim 44 wherein the fungus comprises an antisense
recombinant DNA molecule for the polypeptide.

47. The method of claim 44 wherein the genome of the fungus is disrupted so
that the endogenous gene which encodes the polypeptide is not
expressed.

48. A therapeutic method comprising: administering to an animal suspected
of being infected with a fungal pathogen an effective amount of the agent
of claim 19 or 22.
49. A method to prevent or inhibit infection of an animal or plant by a fungal
pathogen, comprising: administering to the animal or plant an effective
amount of the agent of claim 19 or 22 for a time and under conditions
sufficient to inhibit or prevent fungal growth or reproduction.

50. The method of claim 51 or 52 wherein the animal is a human.

51. The method of claim 51 or 52 wherein the agent is topically
administered.

52. A nucleic acid sequence of a polynucleotide of any one of claims 1 to 8.

53. The nucleic acid sequence of claim 52 which is stored on a computer
readable medium.

54. An amino acid sequence of a polypeptide of claim 9.

55. The amino acid sequence of claim 54 which is stored on a computer
readable medium.

56. The method of claim 48 or 49 wherein the animal is
immunocompromised.

57. The method of claim 48 or 49 wherein the animal has
Coccidioidomycosis.

58. The method of claim 48 or 49 wherein the animal is subjected to
immunosuppressive therapy.

59. The method of claim 48 or 49 wherein fungal iron metabolism is
inhibited.

60. The method of claim 49 wherein the agent is administered to a plant.

61. The method of claim 60 wherein the agent is administered by spraying.

62. A transformed plant, the genome of which expresses a chimeric DNA
molecule which encodes a gene product which confers resistance or
tolerance to the plant to a fungal pathogen by inhibiting fungal iron
metabolism or siderophore production.
[Figure: Amino-acid activating modules. Type I module, about 600 aa, core
sequences 1 to 6; SafB1, core sequences 1 to 4; Type II module, >1000 aa, core
sequences 1 to 6. Domains: aminoacyl adenylate formation, cores 1-5 (cores 2,
3 and 4: ATP binding; core 4: ATPase; core 1: unknown function); thioester
formation (4'-phosphopantetheine binding), core sequence 6 only; amino acid
N-methylation (>400 aa). Examples shown: enniatin B, (D-Hiv-MeVal)3, where
D-Hiv = D-2-hydroxyisovaleric acid; cyclosporin A, where MeBmt =
(4R)-4-[(E)-2-butenyl]-4-methyl-L-threonine, Abu = α-aminobutyric acid and
Sar = sarcosine.]
Sequences producing significant alignments:               Score (bits)  E Value

sp|Q10250|YD22_SCHPO HYPOTHETICAL 170.7 KD PROTEIN C56F8.02...
sp|Q09773|YA84_SCHPO HYPOTHETICAL 162.4 KD PROTEIN C22F3.04...
ref|NP_014736.1| Yor093cp [Saccharomyces cerevisiae] >gi|21...
gb|AAF64300.1| (AF246991) unknown [Drosophila melanogaster]
sp|Q14689|Y184_HUMAN HYPOTHETICAL PROTEIN KIAA0184 >gi|1136...
sp|Q9Y2E4|Y934_HUMAN HYPOTHETICAL PROTEIN KIAA0934 >gi|4589...
dbj|BAA95987.1| (AB040896) KIAA1463 protein [Homo sapiens]
pir||T34061 hypothetical protein F28B3.1 - Caenorhabditis e...
gb|AAF473$4.1| (AE003467) CG7020 gene product [Drosophila m...
pir||T34918 polyketide synthase - Streptomyces coelicolor >...
gb|AAG02359.1|AF210249_18 (AF210249) peptide synthetase NRP...
pir||T18551 saframycin Mx1 synthetase B - Myxococcus xanthu...
sp|Q10976|YT30_MYCTU HYPOTHETICAL 67.9 KDA PROTEIN RV2930 >...
gb|AAG05812.1|AE004669_9 (AE004669) probable non-ribosomal...
pir||A70635 probable fadD31 protein - Mycobacterium tubercu...
pir||C70669 probable acyl-CoA synthetase (EC 6.-.-.-) - Myc...
pir||F70522 probable polyketide synthase - Mycobacterium tu...
pir||B70668 probable acyl-CoA synthetase - Mycobacterium tu...
pir||C70634 probable fadD30 protein - Mycobacterium tubercu...
gb|AAB52538.1| (U75685) acyl-CoA synthase [Mycobacterium bo...
emb|CAB36629.1| (AL035480) putative acyl-CoA synthase [Myco...
pir||T31307 type I fatty acid synthase homolog - Cryptospor...
sp|Q50S8g|YF21_MYCTU HYPOTHETICAL 63.1 KDA PROTEIN RV1521 >...
pir||S73072 u0002r protein - Mycobacterium tuberculosis >gi...
pir||B70820 probable polyketide synthase MTCY409 - Mycobact...
pir||A70877 probable acyl-CoA synthase - Mycobacterium tuber...
pir||E70887 probable fadD32 protein - Mycobacterium tubercu...
pir||S72716 4-coumarate--CoA ligase homolog - Mycobacterium...
gb|AAC83455.1| (AF117694) malonyl CoA synthetase [Rhizobium...
sp|P94547|LCFA_BACSU LONG-CHAIN-FATTY-ACID--COA LIGASE (LON...
pir||H72454 probable long-chain-fatty-acid--CoA ligase APE2...
pir||T03221 probable polyketide synthase module 1 - Strepto...
sp|P29212|LCFA_ECOLI LONG-CHAIN-FATTY-ACID--COA LIGASE (LON...
gb|AAF86393.1|AF235504_14 (AF235504) FkbB [Streptomyces hyg...
sp|P46450|LCFA_HAEIN LONG-CHAIN-FATTY-ACID--COA LIGASE (LON...
sp|P25464|ACVS_CEPAC DELTA-(L-ALPHA-AMINOADIPYL)-L-CYSTEINY...
pir||E69378 long-chain-fatty-acid--CoA ligase (fadD-5) homo...
gb|AAB09715.1| (U12891) ORF5 [Pseudomonas aeruginosa]
gb|AAF00957.1|AF183408_5 (AF183408) McyG [Microcystis aerug...
sp|O30408|TYCB_BACBR TYROCIDINE SYNTHETASE II [INCLUDES: AT...
dbj|BAB12213.1| (AB032549) peptide synthetase and polyketid...
gb|AAF28840.1|AF118888_1 (AF118888) malonyl CoA synthetase...
pir||A61209 hypertension-associated protein SA - rat
gb|AAF95133.1| (AE004273) long-chain-fatty-acid--CoA ligase...
emb|CAA05324.1| (AJ005061) LchAB protein [Bacillus lichenif...
gb|AAD04758.1| (U95370) lichenysin synthetase B; LicB [Baci...
gb|AAF41909.1| (AE002506) long-chain-fatty-acid--CoA ligase...
emb|CAB84971.1| (AL162757) long-chain-fatty-acid--CoA-ligas...
gb|AAF08797.1|AF184956_4 (AF184956) MycC [Bacillus subtilis]
pir||S19560 proline-rich protein MP4 - mouse >gi|53182|emb|...
dbj|BAB06823.1| (AP001517) long-chain fatty-acid-CoA ligase...
pir||T07943 probable AMP-binding protein - rape >gi|1617272...
pir||T18841 hypothetical protein C01G6.7 - Caenorhabditis e...
gb|AAF02529.1|AF150669_1 (AF150669) long-chain-fatty-acid-C...
emb|CAB38084.1| (AJ006977) Ta1 [Myxococcus xanthus]
dbj|BAA37141.1| (AB022340) SA [Mus musculus]
pir||T17428 FK506 polyketide synthase - Streptomyces sp. (s...
dbj|BAB10742.1| (AB023035) 4-coumarate-CoA ligase-like prot...
pir||H69545 probable fatty-acid--CoA ligase (EC 6.2.1.-) fa...
pir||T30226 polyketide synthase - Streptomyces hygroscopicu...
pir||T17463 rifamycin polyketide synthase modules 1-3 - Amy...
ref|NP_058566.1| SA rat hypertension-associated homolog [Mu...
pir||YGBSG1 phenylalanine racemase (ATP-hydrolyzing) (EC 5...
dbj|BAB01855.1| (AP000377) long-chain-fatty-acid CoA ligase...
pir||T05038 4-coumarate--CoA ligase homolog F13C5.180 - Ara...
sp|O68006|BACA_BACLI BACITRACIN SYNTHETASE 1 (BA1) [INCLUDE...
pir||H69354 probable fatty-acid--CoA ligase (EC 6.2.1.-) fa...
pir||B48013 proline-rich proteoglycan 2 precursor, parotid...
pir||B69768 probable acid--CoA ligase (EC 6.2.1.-) ydaB - B...
pir||H71401 probable A6 anther-specific protein - Arabidops...
gb|AAF00961.1|AF183408_9 (AF183408) McyB [Microcystis aerug...
gb|AAB92395.1| (U97078) microcystin synthetase B [Microcyst...
emb|CAA78044.1| (Z12000) AngR protein [Vibrio anguillarum]
pir||T36248 CDA peptide synthetase I - Streptomyces coelico...
sp|P19828|ANGR_VIBAN ANGR PROTEIN >gi|68686|pir||XPVCAR ang...
gb|AAF26925.1|AF210843_22 (AF210843) nonribosomal peptide s...
pir||D69187 probable acid--CoA ligase (EC 6.2.1.-) MTH657
                                                                   Score     E
Sequences producing significant alignments:                       (bits)   Value

sp|Q10250|YD22_SCHPO HYPOTHETICAL 170.7 KD PROTEIN C56F8.02...  819  0.0
ref|NP_014736.1| Yor093cp [Saccharomyces cerevisiae] >gi|21...  277  3e-73
sp|Q09773|YA84_SCHPO HYPOTHETICAL 162.4 KD PROTEIN C22F3.04...  169  1e-40
sp|Q14689|Y184_HUMAN HYPOTHETICAL PROTEIN KIAA0184 >gi|1136...  89  3e-16
sp|Q9Y2E4|Y934_HUMAN HYPOTHETICAL PROTEIN KIAA0934 >gi|4589...  86  ?e-15
gb|AAF64300.1| (AF246991) unknown [Drosophila melanogaster]  81  ?e-14
pir||T18551 saframycin Mx1 synthetase B - Myxococcus xanthu...  80  ?e-13
pir||T34061 hypothetical protein F28B3.1 - Caenorhabditis  73  1e-11
sp|Q10976|YT30_MYCTU HYPOTHETICAL 67.9 KDA PROTEIN RV2930 >...  72  2e-11
pir||C70669 probable acyl-CoA synthetase (EC 6.2.1.-) - Myc...  71  6e-11
pir||S73072 u0002x protein - Mycobacterium tuberculosis >gi...  68  3e-10
pir||F70522 probable polyketide synthase - Mycobacterium tu...  66  1e-09
emb|CAB36629.1| (AL035480) putative acyl-CoA synthase [Myco...  65  4e-09
gb|AAG02359.1|AF210249_18 (AF210249) peptide synthetase NRP...  63  1e-08
pir||B70820 probable polyketide synthase MTCY409 - Mycobact...  62  2e-08
pir||A70877 probable acyl-CoA synthase - Mycobacterium tuber...  61  5e-08
pir||A70635 probable fadD31 protein - Mycobacterium tubercu...  59  2e-07
pir||T31307 type I fatty acid synthase homolog - Cryptospor...  59  2e-07
sp|Q50586|YF21_MYCTU HYPOTHETICAL 63.1 KDA PROTEIN RV1521 >...  59  2e-07
gb|AAF95133.1| (AE004273) long-chain-fatty-acid--CoA ligase...  59  2e-07
pir||B70668 probable Acyl-CoA Synthetase - Mycobacterium tu...  59  2e-07
gb|AAB52538.1| (U75685) acyl-CoA synthase [Mycobacterium bo...  59  3e-07
pir||S72716 4-coumarate--CoA ligase homolog - Mycobacterium  57  6e-07
pir||E69378 long-chain-fatty-acid--CoA ligase (fadD-5) homo...  54  7e-06
gb|AAC83455.1| (AF117694) malonyl CoA synthetase [Rhizobium...  54  7e-06
gb|AAG05812.1|AE004669_9 (AE004669) probable non-ribosomal ...  53  1e-05
emb|CAA70871.1| (Y09700) rpfB [Xanthomonas campestris]  52  2e-05
pir||T34918 polyketide synthase - Streptomyces coelicolor >...  52  4e-05
sp|P29212|LCFA_ECOLI LONG-CHAIN-FATTY-ACID--COA LIGASE (LON  52  4e-05
sp|P46450|LCFA_HAEIN LONG-CHAIN-FATTY-ACID--COA LIGASE (LON...  52  4e-05
gb|AAB09715.1| (U12891) ORF5 [Pseudomonas aeruginosa]  50  1e-04
pir||A45062 long-chain-fatty-acid--CoA ligase (EC 6.2.1.3) ...  49  2e-04
gb|AAF83100.1|AE003882_2 (AE003882) regulator of pathogenic...  49  2e-04
pir||H72454 probable long-chain-fatty-acid--CoA ligase APE2...  49  2e-04
gb|AAF28840.1|AF118888_1 (AF118888) malonyl CoA synthetase ...  49  2e-04
pir||T03221 probable polyketide synthase module 1 - Strepto  48  4e-04
gb|AAF86393.1|AF235504_14 (AF235504) FkbB [Streptomyces hyg...  48  5e-04
gb|AAF79612.1|AC027665_13 (AC027665) F5M15.18 [Arabidopsis ...  47  7e-04
pir||S73071 u0002q protein - Mycobacterium tuberculosis >gi...  47  7e-04
dbj|BAA95987.1| (AB040896) KIAA1463 protein [Homo sapiens]  46  0.001
pir||YGBSG1 phenylalanine racemase (ATP-hydrolyzing) (EC 5  46  0.001
pir||T07908 4-coumarate--CoA ligase (EC 6.2.1.12) 2 - weste  46  0.002
gb|AAF47364.1| (AE003467) CG7020 gene product [Drosophila m...  45  0.003
gb|AAF41909.1| (AE002506) long-chain-fatty-acid--CoA ligase...  45  0.003
emb|CAB84971.1| (AL162757) long-chain-fatty-acid--CoA-ligas...  45  0.003
gb|AAF39590.1|AC007858_4 (AC007858) This gene is a member o...  45  0.005
dbj|BAB06823.1| (AP001517) long-chain fatty-acid-CoA ligase...  44  0.006
pir||C70634 probable fadD30 protein - Mycobacterium tubercu...  44  0.008
gb|AAF08795.1|AF184956_2 (AF184956) MycA [Bacillus subtilis]  44  0.008
pir||E69438 probable fatty-acid--CoA ligase (EC 6.2.1.-) fa  43  0.010
pdb|1AMU|B Chain B, Phenylalanine Activating Domain Of Gram  43  0.013
pir||T18841 hypothetical protein C01G6.7 - Caenorhabditis e  43  0.013
sp|P14687|GRSA_BACBR GRAMICIDIN S SYNTHETASE I [INCLUDES: A...  43  0.013
gb|AAA58718.1| (M29703) grsA-encoded protein [Brevibacillus ...  43  0.013
sp|P94547|LCFA_BACSU LONG-CHAIN-FATTY-ACID--COA LIGASE (LON...  42  0.023
pir||E70887 probable fadD32 protein - Mycobacterium tubercu...  42  0.023
gb|AAF79611.1|AC027665_12 (AC027665) F5M15.17 [Arabidopsis ...  42  0.030
pir||T17463 rifamycin polyketide synthase modules 1-3 - Amy...  42  0.030
pir||C69471 probable fatty-acid--CoA ligase (EC 6.2.1.-) fa  42  0.030
sp|Q01886|HTS1_COCCA HC-TOXIN SYNTHETASE (HTS) >gi|167219|g...  41  0.039
pir||A45086 HC-toxin synthetase - fungus (Cochliobolus carb...  41  0.039
pir||T07943 probable AMP-binding protein - rape >gi|1617272...  41  0.039
gb|AAD40664.1|AF150686_1 (AF150686) 4-coumarate:coenzyme A ...  41  0.039
sp|P31685|4CL2_SOLTU 4-COUMARATE--COA LIGASE 2 (4CL) >gi|10...  41  0.051
pir||T30226 polyketide synthase - Streptomyces hygroscopicu...  41  0.051
sp|P14912|4CL1_PETCR 4-COUMARATE--COA LIGASE 1 (4CL) >gi|82...  41  0.051
gb|AAD34542.1|AF139644_1 (AF139644) luciferase [Phrixothrix...  41  0.051
pir||T28932 probable 4-coumarate--CoA ligase (EC 6.2.1.12)  41  0.051
sp|P31684|4CL1_SOLTU 4-COUMARATE--COA LIGASE 1 (4CL) >gi|10...  41  0.051
pir||T09755 4-coumarate--CoA ligase (EC 6.2.1.12) 4CL2 - lo...  41  0.051
pir||T05038 4-coumarate--CoA ligase homolog F13C5.180 - Ara...  41  0.051
... 2 (4CL) >gi|28...  41  0.051
gb|AAF37732.1|AF052221_1 (AF052221) 4-coumarate--CoA ligase...  41  0.067
gb|AAF37733.1|AF052222_1 (AF052222) 4-coumarate--CoA ligase  41  0.067
pir||T03390 4-coumarate--CoA ligase (EC 6.2.1.12) isoform 2  41  0.067
pir||T17428 FK506 polyketide synthase - Streptomyces sp. (s...  41  0.067
emb|CAB81058.1| (AL161502) 4-coumarate--CoA ligase-like pro...  41  0.067
pir||T07909 4-coumarate--CoA ligase (EC 6.2.1.12) 1 - weste...
gb|AAB18637.1| (U50845)
gb|AAF02529.1|AF150669_1 (AF150669) long-chain-fatty-acid-C...
gb|AAF91310.1|AF239687_1 (AF239687) 4-coumarate:CoA ligase ...
gb|AAG29784.1|AF235050_7 (AF235050) putative ligase [Strept...
gb|AAG06688.1|AE004752_4 (AE004752) long-chain-fatty-acid--...
pir||C75364 long-chain fatty acid--CoA ligase - Deinococcus ...
pir||H69354 probable fatty-acid--CoA ligase (EC 6.2.1.-) fa...
sp|O30408|TYCB_BACBR TYROCIDINE SYNTHETASE II [INCLUDES: AT...
pir||H69545 probable fatty-acid--CoA ligase (EC 6.2.1.-) fa...
dbj|BAA05006.1| (D25416) luciferase [Photuris pennsylvanica]
pir||A70551 probable acid--CoA ligase (EC 6.2.1.-) fadD35 -...
pir||S73073 pks002a protein - Mycobacterium tuberculosis >g...
sp|P17814|4CL1_ORYSA 4-COUMARATE--COA LIGASE >gi|82454|pir||...
emb|CAB84715.1| (AL162756) putative acyl-CoA ligase [Neisse...
pir||H69371 probable acid--CoA ligase (EC 6.2.1.-) AF0976 -
pir||A70669 probable acyl-CoA synthetase (EC 2.3.1.-) - Myc...
emb|CAA53230.1| (X75542) 4-coumarate:CoA ligase [Vanilla pl...
dbj|BAA05005.1| (D25415) luciferase [Photuris pennsylvanica]
gb|AAF41652.1| (AE002476) long-chain-fatty-acid--CoA ligase...
dbj|BAB01715.1| (AB023045) 4-coumarate:CoA ligase [Arabidop...
pir||T36202 probable fatty acid--CoA ligase - Streptomyces ...
pir||H71401 probable A6 anther-specific protein - Arabidops...
gb|AAG09788.1|AF254143_1 (AF254143) repressed by TUP1 prote...
ref|NP_014458.1| Similar to ferric reductases Fre1p and Fre...
sp|P78588|FREL_CANAL PROBABLE FERRIC REDUCTASE TRANSMEMBRAN...
emb|CAB45649.1| (AJ387722) ferric reductase [Candida albicans]
Ferric (and cupric) reductase; Fre1p [Sacc
similar to FRE2; Fre5p [Saccharomyces cere...
Fre7p [Saccharomyces cerevisiae] >gi|39136...
similar to FRE2; Fre6p [Saccharomyces cere...
sp|Q04800|FRP1_SCHPO FERRIC REDUCTASE TRANSMEMBRANE COMPONE...
pir||T40777 ferric reductase transmembrane component - fiss
emb|CAB91820.1| (AL356192) related to ferric reductase [Neu
gb|AAC39478.1| (AF055356) respiratory burst oxidase protein
emb|CAB70727.1| (AL137404) hypothetical protein [Homo sapiens]
pir||S23737 proline-rich protein precursor - kidney bean >g...
gb|AAF46035.1| (AE003434) CG15784 gene product [Drosophila ...
dbj|BAA95995.1| (AB040904) KIAA1471 protein [Homo sapiens]
pir||T14756 hypothetical protein DKFZp564F0923.1 - human (f
gb|AAB70928.1| (AF020261) proline rich protein [Santalum al
pir||T04859 extensin homolog F28A21.80 - Arabidopsis thalia...
gb|AAF87118.1|AC006434_14 (AC006434) F10A5.23 [Arabidopsis ...
pir||S51798 gamma-kafirin precursor - sorghum
emb|CAA44347.1| (X62480) gamma-kafirin preprotein [Sorghum ...
dbj|BAA99921.1| (AP001306) contains similarity to cell wall-...
sp|P13983|EXTN_TOBAC EXTENSIN PRECURSOR (CELL WALL HYDROXYP...
ref|NP_013148.1| Ylr047cp [Saccharomyces cerevisiae] >gi|21
pir||T14355 protein-tyrosine-phosphatase (EC 3.1.3.48) TD14
gb|AAA73078.1| (M73688) [Sorghum bicolor endosperm tissue m
sp|P48038|ACRO_RABIT ACROSIN PRECURSOR >gi|1085468|pir||S47...
pir||T00264 high carbon dioxide response protein 2 - Chloro...
gb|AAC82365.1| (AF055904) unknown [Myxococcus xanthus]
pir||A54523 histidine-rich protein - Plasmodium lophurae (f
pir||T28682 hypothetical protein - Streptomyces coelicolor
pir||T06291 extensin homolog T9E8.80 - Arabidopsis thaliana
sp|P06599|EXTN_DAUCA EXTENSIN PRECURSOR >gi|82047|pir||A243...
pir||S25299 extensin precursor - tomato >gi|170444|gb|AAA34...
gb|AAF50413.1| (AE003555) Gug gene product [Drosophila mela
pir||T33997 hypothetical protein W03G1.5 - Caenorhabditis e...
AF217844_1 (AF217844) GRUNGE [Drosophila mela...
(AF055355) respiratory burst oxidase protein...
ref|NP_031989.1| ecotropic viral integration site 1 [Mus mu
sp|P14404|EVI1_MOUSE ECOTROPIC VIRUS INTEGRATION 1 SITE PRO...
emb|CAB76093.1| (AL157956) putative oxidoreductase [Strepto...
emb|CAA04777.1| (AJ001482) Evi1delta 105 [Mus musculus]
pir||PQ0475 pistil extensin-like protein (clone pMG04) - co...
gb|AAF73291.1|AF155232_1 (AF155232) extensin [Pisum sativum]
gb|AAD37405.1|AF148222_1 (AF148222) merozoite surface prote...
pir||A48232 cysteine-rich extensin-like protein 1 precursor...
ref|NP_057309.1| RhD type IIIa protein [Homo sapiens] >gi|6...
gb|AAC38842.1| (AF010462) merozoite surface protein 2 [Plas...
(AF217011) merozoite surface protein 2 [Plas
ref|XP_001241.1| similar to hypothetical protein FLJ20337 (
gb|AAC18092.1| (AF056965) mutant membrane protein RhCe [Hom...
pir||T10863 extensin precursor - kidney bean >gi|727264|gb|
pir||B29356 hydroxyproline-rich glycoprotein precursor (clo...
dbj|BAB01951.1| (AP002048) extensin-like protein [Arabidops...
gb|AAF34294.1|AC005941_6 (AC005941) L5204.7 [Leishmania major]
gb|AAD24546.2| (AF116856) neurocan core protein precursor [
pir||S40517 erythrocyte membrane protein - human
ref|NP_057208.1| Rhesus blood group, D antigen; Rhesus syst...
pir||S40516 erythrocyte membrane protein - human
emb|CAB09722.1| (Z97026) rhesus D category VI type III prot...
gb|AAD25300.1|AF088276_1 (AF088276) NADPH oxidase; gp91 ph...
gb|AAB34660.1| RhPI-2e=Rhesus blood group antigen isoform {...
pir||T25800 C2H2-type zinc finger domain, WT1 homolog - Cae
pir||S12549 hypothetical protein - human herpesvirus 4
gb|AAF57886.1| (AE003804) CG17288 gene product [Drosophila ...
gb|AAD52161.1|AF143503_1 (AF143503) nitric oxide synthase
dbj|BAA35135.1| (AB008227) Extensin [Adiantum capillus-vene...
gb|AAB34659.1| RhPI-1d=Rhesus blood group antigen isoform {...
emb|CAB65664.1| (AJ252251) glycoprotein G-2 [human herpesvi...
ref|NP_044534.1| virion glycoprotein G [human herpesvirus 2...
pir||S78480 Rhesus blood group antigen-like protein isoform...
ref|NP_005057.1| splicing factor proline/glutamine rich (po...
gb|AAF59500.1| (AC024805) Hypothetical protein Y51H7C.a [Ca...
dbj|BAA81899.1| (AB018966) Rh blood group D antigen (RhD)
ref|NP_065231.1| Rhesus blood group, CcEe antigens; Rhesus
pir||T29299 hypothetical protein C50F7.2 - Caenorhabditis e...
Alignment Report of Untitled, using Clustal method with PAM250 residue weight table.

[Alignment of Fered.pro and ScFre3p.PRO with the Majority (consensus) sequence, alignment positions 610 to 720.]

Decoration: Shade (with solid black) residues that match the Consensus exactly.
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CCTTGGrAGTGCACTGGGCGTGTGAATAGCTGTCAGCACGCCCGCTGGCGGTTCGCGCCATGGTGGAGATTTTGCACGC 
GACaTGGACGACGACGGCCTCGGC^GCGTGAGGAACATGTCAAAATGAACCCAGGGGTGCATCAAAGCCGTTTTACCTr 



_ - - -3AAAGAGGGACGAGACAGGC 

GGGTGTGGCGGGCGATAGTGCGGGCGAGAGGGAAAGAGAGTTGGAGAGAGGGAACCATGGTGTTrGCTCGCGGCGTCGr 

GCATCGTGCAGGGCGCGTCGGCCCATCATGGCATGGTCATTGGCATCATCGGTACGTACATGGCCGCAAGCATCATGAG 

CACAGCGGCCATTTGCATTTCTCGTGGCCGTGGTGGCCGTGGCAAGCTGGTGTCTGTGGCGCGGCGTCTGCGTGTGTGG 

ATCGCCAGGCACCGCGGCGTGATTGGCACAGGCAGGCGAGGGCGGGCATGGATGGATGCAGCGTGTGCAGTCGAATCCG 

AAGCACAGCGCAGGGCCGGCAGCCGGCACGTCTGTGGTGGGCGTGTGCAGGCTGGGCGGCGGTGCCGAGTGGGATGGGG 

ATGATAGGATGGATGGGCAGTCTGGCGAGGGTGGGCTGAGGGGCCTTGGTTTCATGGCGCTTCCAGGCACGCGGGCCTG 

AGTGGCCAGC^TTCGCAGCCTCCGCCGCCCCGTTTCACCCGCTAGCAGCTGGTCGGACC^GCCC^GCCAATCGTCCGC 

CTGGCCCTGGCGCCCCCCCTGGTCGCTGCCGGCCAGGCCACCCCCTCGCGCGAACCTCATATCCACTGTTTCCCTCGCT 

CCCTCTCCGACCTC^CGCTCGCCGTCCCTGCTGTCTGGCTCTCTGCCCAC^TTCGTTTCCACTGTTCCTCACTCGCTAC 
GGGAACO^AAAAAC^CGCTCCCTGCCCGCCTGCTGCC^^ 

CAAGCCCATTC^CCCC^GCCGCGCCCTGGCGACCATCAATCATTrc^ 

CCCGATACTGTOC^GGCCACCCCCGCCCCTC^^ 

ATGCATTTCCTGTGACCACGCCTGTCCTCT^ 

GGACCACCTGGCTACACAGCGACTACATTC^CAACAACATCAACTGC 

GCCTCAATACTCGCCCATCGCCCAATTCCCCCCTTTAGCACCCGCCAGCATGTCTCTCTCCGGCCTGCTGCGCTCGCGG 
GAGGCACCTOCTGCC^GCGT^CCTCCTCT^ 

TCACTGGTGTCGACCAAGTCGGCAACTTCTTGTGGGTCGACACCTTTCTCTACATGCTGATTGGCATCTCTGGCATGCT 
CCTCATGCTCCGCATCTCCAACATGGTCTGGAAGCACAGC^ 

TGGGAGACCAACCGAACAAGCTGGTGGCCCTGGCTCAACCGCCAGATCCTCGTCGCCCCGCTCTGGAAGAAGAAGCACA 
ACGCCCAGTTCCAGATCAGCAGCGCGATTGACAACGGAACCCTC^ 

CGTCGGCCTCAACGTTGCATGGTGCCTTGCCCTCCCCTACGACGTCCTCGACCACAGGGAGACGCTCGCCGCCCTTCGT 
GGACGCTCTGGAACCCTCGCCGCCCTC^CCTCATCCCCACCATCCTCTTCGCCCTCCGCAACAACCCCCTCACTCCCT 
TCTCCAGGTCTCGTACGACGACTTCAACCTTTTCGR.CCGCTGGGCTGCCCGAATCACC^TTGCCGAGGCCAT1^TCCAC 
ACTGCCGCTTGGTTGTACAACACCAAGGCTGGCGGTGGA'TGGCACGCCGTCGTAGCTGCCCTCCACACCGAGGGCTCTT 
ACGGA1X3GGGC^TGGGCGGAACTGTCGCCTTCACC1^CATCGGC^TCCAGGCCTGGTCCCCATTCCGTCACGCCTTTTA 
CGAGACCTTTCTCAACATCC^CCGCGTGATGGTC^TTGCTGCTCTCCTCGGCTTGTACAAGCACCTGGAGCTGCACGCT 
CTGCCCCAGGTCCCATGGATGTACCTCATCTTCATCTTCTGGGCGGCTGAGTGGTTCCTCCGCCTGTGCTCCATCTGCT 
ACTACGGCTTCAGCCTGAAGCAACGCTCTTCCATCACCGTCGAGGCCTTGCCTGGCGAAGCTGTCCGTCTAACCATCAA 
CATGGTCCGCGAATGGACCCCCCGTCCCGGATGTCACGTGCACATGTGGATGCCTCGCCTCTCCCTATGGTCCTCGCAT 

CCATTTTCCGTCGCCTGGGCTGCGACCCCTGACCGACGACTCCAAAGAGATGACGCTTCCCACTTTGGAAGGCGACGTC 

ACCATGATCAATGGCCAACCC^GGAAATCAAAAC^^ 

ACAGGGCGATATTC 

RLRSARGTTWLHSE)YIHNNINCTGPHSGAPGSAPQYSPIAQFPPIAPA£M^ 

kKYSYGLTGVDQVGNFIiWVDTFLYMLIGISGMLLMLRISNMVWKHSRM 

LWKKKHNAQFQISSAIDNGTLPGRWHTIMIiLIYVGLNVAWCLAIjPYD^ 

lWPI,ISLI.QVSYDDPNI,FHRWAARXTIAEAIVHTAAWLyNTKAGGGWHAWAALHTBGSYGWGMGGT^^ 

PFRHAFYETFmiHRVMVIAALLGLYKHLELHA^ 

AVRIiTINMVREWTPRPGCHVHMWM^^ 
Distribution of 127 Blast Hits on the Query Sequence

[Graphical overview of alignment scores along the query sequence; axis 0 to 2000.]

                                                                   Score     E
Sequences producing significant alignments:                       (bits)   Value
(AL390354) conserved hypothetical protein 
(AD136078) probable membrane transporter [S. . , 
(AL355920) putative MFS all ant oat e permease... 
PUTATIVE TRANSPORTER C11P3*18C >gij749-, 



emb 


CAB99379- 


1 


ernb 


CAB65616. 


1 


emb 


CAB91174 . 


1 



ref 



emb 



N P _ 0ll776.il Snalp is a high affinity nicotinic acid pi,., 



CAB63S40.l[ (AL133521) putative transporter [Schisosacc. . . 



ah I AAG07513 - 1 jAE004829 3 (AE004829) probable MFS transpbrte, 
cyb [ AAG05602 . 1 1 AE0 04648JL (AE004648) probable MFS transporter . . 
piLrj 1D6499 5 hypothetical protein b2246 T Escherichia coli 



sp|P76470 



SP P70786 



sp[Q44470 



YFAV_ECOLI HYPOTHETICAL 46*3 KD PROTEIN IN G&PC-A. 



TUB 3 AGRVI PUTATIVE TARTRATE TRANSPORTER >gi 19843*.. 



TUB4 AGRVI PUTATIVE TARTRATE TRANSPORTER >gi|8052.. 



etrib|CAB61275.l| (AL132991) putative transporter protein [St 
sp 1 013880 | YE1G SCHPO PUTATIVE TRANSPORTER C1B3.16C >glf7493 



erob 



emb 



ref 



emb 



pir 



CAB66219.1 



CAB75122.1 



(AL136503) probable transmembrane transport . 
(AL139075) transmembrane transport protein 



NP 012100.1 | Yill66cp >gi | 731893 | sp | P4 0445 |YIQ6_YEAST P 



CAAS6046-l| (Z37980) hypothetical 4-hydroxyphenylacetat 



{T41604 probable membrane .transport protein - fission y» . . 



[T39680 probable allantoate permease - fission yeast (S . . . 



crblAAB53495.1|AF144422_2 (AF144422) HpaX [Salmonella dublin] 



ref 



pir 



|T41345 probable allantoate permease - fission yeast (S. 



SP I Q13 8 7 9 I YE1F SCHPO • PUTATIVE TRANSPORTER C1B3.15C >gi 7493... 

ref | NP 012686 »l| allantoate permease; DalBp >gi 1 118233 1 sp I P 

SP 1 005181 [ PHT1PSBPU PHTHAIiATE TRANSPORTER >gi 1 295708 |dbj |B. . - 

pir | |C70818 probable ABC transporter - Mycobacterium tiiberc 

gb J AAD41517 . 1 ] AF152094_1 (AF1S2094) phthalate transporter 



ref 



ref 



MP 009957.1 | Amino acid permease; Fen2p >gi {140479 |sp{ P. 



NP_009333,lj putative permease; Seolp >gi{ 731298 |sp|P39. . . 



crb[AAG05650.l|AE004652_3 (AE004652) probable 2 -ketogluconat 



pir 



ref 



gb|AAD03552,l| (AF095748) putative phathalate permease C-te 



ref 



db± 



dbi 



gb 



ref 



dbi 



gb 



qb 



gb 



sp 



T4048S transmembrane transporter Idzlp - fission yeast. 



T40140 transmembrane transporter lizlp - fission yeast. 



NP 014479. l| Yoll63wp >gi | 2132861 1 pir j JS66862 probable 



NP 013 045.1 [ Yll055wp >gi j 1077324 |pir| j S50965 probable 



I A71619 membrane transporter PFB0275w - malaria parasit. 



BAA87280.1] (AB027976) Hypothetical protein [Schizosacc 



{T12997 hypothetical protein T21L8.170 - Arabictopsis th. . . 



BAA1388S . 1 f D89224) similar to Saccharomyces cerevisia. . . 



AAD46810 . i { AF157643 4 (AF157643) putative transporter pr. 



NP^ 005486 .l| -Na/P04 cotransporter >gi 1 4587207 | dbj |BAA76. . . 



BAA95074 . 1 \ (AB041591) unnamed protein product [Mus raus. 



P3 9 3 9 8 | YJ JL ECOIiI HYPOTHETICAL 49.4 KD PROTEIN IN TSR-MD. . 



AAD49570 . 1 



AAFS5770 . 1 



AAD49S71.1 



AAD49S72.1 



AAC1387B.1 



AF1350371 (AF135037) nitrate transporter [Cy. 



(AE003730) CG4288 gene product [alt 1] {Dros.. 
(AF135038) nitrate transporter [Cylindrothec . - 
AF135'039_l (AF135039) nitrate transporter [Cy. . 



(U39735) high molecular weight basic nuclear. 



P314S7 DGOTECOI/E D -GALACTONATE TRANSPORTER 



pir | ID65171 hypothetical 48.8 kD protein in ibpA-gyrB inter 

plr| |T15201 hypothetical protein F12B6.2 - Caenorhabditis e. . . 
gb | AAF97770 . 1 j AF24457S_l (AF244578) membrane glycoprotein S... 



NP 013104.1} Ylr004cp >gi (2132659 |pir j[ S64826 probable ... _^0 



218 


* vS DO 


192- 








188 






/Id A Q 


131 


-ie-43 






lip 


2e-30 


llfi. 


2e-29 


116 


2e-29 


116 


6e-29 


114 


2e-28 


107 


6e-27 




/e-^i / 


ioi 


3e-24 




2e~23 


i°s. 


2e*-22 


88 












80 


8e-20 


90 


2e-19 


91 


4e-17 


91 




• 74 


le-16 


87 


9e-16 


71 


3e-15 


83 


*le-14 


81 


4e-14 


78 


5e-13 


67 




69 


2 e -X0 


68 


4e-10 


63 


le-08 


61 


4e-08 


52 


9e-08 


55 




51 




50 


"A-05 


50 


"ffk-04 




2e-04 


48 


4e-04 


_47 


6e-04 


47 


6e-04 


47 


8e-04 


46 


0.002 


46 


0-002 


45 


0.002 


' 44 


0.005- 


„ 43 


0.016 


43 


0.016 


-42 


0.028 


41 


0.062 
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re f |NP ^ 011S79,i) H+-biotin symporter,- Vhtlp >gi (1723674 1 spf . . 
spl09Z7N9|UHPT CHLPN PROBABLE HEXOSB PHOSPHATE TRANSPORT PR, . 
gb|AAD25i66,ijAF030337_4 (AF030397) putative 4 -chlorobenzoa mm 
emb|CAB885S0,l[ . (AL353819) related to carboxylic acid trans.. 

(AF042490) 4-chlorobenzoate transport protei.. 
AE004461 7 (AEQ04461) 4-hydroxybenzoate trans.. 



ab 1AAF78822.1 



3k 



AAG03 624,1 



pir| |S14183 DNA-directed RNA polymerase (EC 2.7-7.6) larges! 
gbjAAB59301.il (L20696) meiotin-1 [Liliuro longiflorum] 



pir 



pir 



reif 



pir 



emb 



S14182 



DNA- directed RNA polymerase (EC 2.7.7.6) larges... 
DNA-directed RNA polymerase (EC 2.7.7.6) larges... 
^,,037488^1]. monocarboxylate transporter 3 >gij 6093322 [.. . 



S14181 



gb 



sp 



.1T07796 DMA-directed RNA polymerase (EC 2.7.7.6) larges. . . 
GAA36734.1] (X52493) DNA~ directed RNA polymerase [Glyci... 

(AE003839) CG8791 gene product [Drosophila m. . . 
(AF130226) NADH dehydrogenase subunit P [Que... 



AAF59171.2 



AAF08182.1 



par 



pir 



sp 



3k 



3k 



3k 



sp 



3k 



3k 



£28722 |NPT1 RABIT RENAL SODIUM- DEPENDENT PHOSPHATE TRANS 
F75S80 probable sugar transporter - Deinococcus radiod 
S56583 hypothetical protein £2 61b - Escherichia coli >. . 



001636 | LMBV CHICK LAMININ BETA- 1 CHAIN .VARIANT (LAMININ ! \ ' ~36 
AAF49470.2 ' " " ~ " ~ *~ ~ * T % " ~ ' - ~ - .... _ ^ — 



AAD00528.1 



AAF53319.1 



AAF44801.1 



AAF67524 . 1 



AAF97769-1 



(AE003527) CG4877 gene product [Drosophila m! . ] 
_ (U82223) titin [Sus scrofa] 

. (AE003640) BG:DS07660.1 gene product [Drosop. 
AE003406 6 (AE003416) hypothetical protein [D. 
AF?04396 1 (AF204396) monocarboxylate transpo. 



Q62795[NPT1_RAT RENAL SODIUM-OEPENDENT PHOSPHATE TRANS Po! ! I 35 



. , » (AF244577) membrane glycoprotein HP59 [Homo * 

ref fop 03656*6.1 ( solute carrier family 17 (anion/ sugar tran. 
.ftAC15775.l| (AF061335) oxy tetracycline exporter [Strepto, 



pir[ |JE0378 DNA <cytosine-5-) -me thyltransf erase (EC 2.1.1.3. . . 
pir | [F69443 octaprenyl -diphosphate synthase (ispB) homolog . . . 
QblAAD04032.ll (AF079900) tetracycline efflux protein [Stre,.. 
emb|CAB76602.ll (AJ271264) glycerol kinase [Staphylococcus . . . 



sp 



3k 



3k 



P77416 I HYFP ECOLI HYDROGENASE-4 COMPONENT D >gi I 7466377 I . 
AAF63452.1 — — - — * ~ ' 



AAB28462.1 



emb 



emb 



. emb 



emb 



emb 



ref 



pir 



AF218267 9 (AF218267) benzoate transport prot... 34 



CAB67228.1 



CAB67217.1 



CAB76600.1 



CAB76601-1 



extensin=nodule- specific proline -rich protei, 
(AJ271079) NdhF* protein [Oenothera elata s... 
(AJ271079} NADH-plastoquinone oxidoreductas. . . 

(AJ271262) glycerol kinase [Staphylococcus 

(AJ2 71263) glycerol kinase [Staphylococcus 

(AiT271271) glycerol kinase [Staphylococcus 

_ Probable multidrug resistance protein; Ybr. . . 

«^JT27092 hypothetical protein Y51B9A.6 - Caenorhabditis ... 
sp I Q51955 I PCAK PSEPU 4-HYDROXYBEN20ATE TRANSPORTER >gifll47*.. 
pir | |T34995 probable integral membrane efflux protein - Str*.. 
qbjAAC69842.l| (AF.076683) unknown [Staphylococcus aureus] 



CAB76609.1 



NP 009852.1) 



emb 



emb 



CAB76606-1 



CAB76608.1 



(AJ271268) glycerol kinase [Staphylococcus 
(AJ271270) glycerol kinase [Staphylococcus 
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SEQUENCE LISTING 

<110> Syngenta Participations AG 
      Cornell Research Foundation, Inc.
      Yoder, Olen
      Turgeon, Barbara G.
      Lu, Shen-wen

<120> Fungal Iron Reductase Gene

<130> 1360.017WO1

<150> US 60/252,732
<151> 2000-11-22

<150> US 60/252,649
<151> 2000-11-22

<160> 210

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 9
<212> PRT
<213> Artificial Sequence



<220>
<223> Motif

<221> SITE
<222> 3
<223> Xaa = Glu or Asp

<221> SITE
<222> 4, 6, 8
<223> Xaa = any amino acid

<400> 1



WO 02/42444 

Val Leu Xaa Xaa Gly Xaa Gly Xaa Gly 
1 5 
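
The motif of SEQ ID NO:1 (Val Leu Xaa Xaa Gly Xaa Gly Xaa Gly, where Xaa at position 3 is Glu or Asp and Xaa at positions 4, 6 and 8 is any amino acid) can be read as a simple search pattern. The following is a minimal sketch only, assuming one-letter amino acid codes and Python's standard `re` module; the name `find_motif` and the peptide in the usage note are hypothetical illustrations and are not part of this listing.

```python
import re
from typing import List

# SEQ ID NO:1 written as a regular expression over one-letter codes:
# V L [E or D] X G X G X G, with X standing for any residue letter.
# The lookahead (?=...) lets overlapping occurrences be reported too.
MOTIF = re.compile(r"(?=VL[ED][A-Z]G[A-Z]G[A-Z]G)")


def find_motif(protein: str) -> List[int]:
    """Return the 1-based start position of every motif occurrence."""
    seq = protein.upper()
    return [match.start() + 1 for match in MOTIF.finditer(seq)]
```

For example, scanning the hypothetical peptide `MKVLEAGSGTGAA` with `find_motif` reports a single occurrence starting at residue 3.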

<210> 2 
<211> 6550
<212> DNA 

<213> Cochliobolus heterostrophus 



<400> 2 



"1 r\ 4"" /^f r~i t~\ +■» <T j~e *-\ 

± u Lgcctgcgcc 


tgtgcttgtg 


cccgtggaac 


gucgcggccc 


gc tycugca u 


m /— r /^i r~i 4~ 4-» /^t 4— 

agectatctg 


to u 


tacatacaac 


accat cccat 


cccgcttcac 


/—% 4-* /—i 1 4-» 4-> /—% 

ctycccuycc 


4— i— 1 y— 1 4— /*-* y—t 4— /— t /— f 

ucccLCCLCy 


4— /— T /—I — > /*1 y— i — \ 4— 

tyccacacat 




ccgccyccca. 


caacaccatg 


gc tgegacca 


a.ccccga.gcc 


geaggecaaa 


y— 1 y— r j— t y^r y— f — \ y— f y~t 

CLycayyayc 


J_ C5 U 


4— y— cy-w' /~1 /^i ^ y— | /— t" 

uy gaccacy a. 


gcucgaggag 


/^f /*i *3 4 — 4— 4— ^» 

yycyataLua 


^ ^ ^ ^ r~ff~t 

CaCaadaayy 


ytccgtacuy 


/—l 4— t^f y~1 q /—l /— i r> /— 1 /—l 

ctgcaccacc 


A ft U 


a r< (*i /~i — \ 4— /-i /-i 


f~» 4— 1 4— f~\ 4— 

gccLCtCLyC 


guy cyctaat 


cagtcycaua 


rtr* 4— ^ 4— i— f 2% a a o 

yv_ L. a Lydddci 


a <~< r~r4- t~i(~^r* r* r* 

ac_y LtyoaLL 


*5 n n 
O u u 


1 /T 4— r~r r* 4— <*T/~< 4- <T+" 

xoyL-ycL-yt-uyL- 




agggee ty ac 




—j /— <■ 4— 4— i^f /-i —j /— r f* 

ciy tuy v^cLy y c 


(~\ f~x -ri <— i /— < 4— rra a #"i 

uyaL.L. L-yaau 


O O \J 




caccccaacc 


a uccag L-gag 


y— f <~f 4— i— 1 4— r 4— 

ggc uc uege u 


cccgcaccgc 


4— /— 1 /— < 4~ 4— 4— /— r /— 1 4— 

aucctLtyct 


ft z u 


attccgt ccg 


gtccgagt cc 


ateaeggega 


ccacaacccc 


cacatat cca 


gc t cccccgc 


a q n 


cccgac teat 


accat gaege 


ttccgcacag 


ggccaattgg 


gcgcacccat 


gecatatgeg 


Oft u 


aacgcct ccg 


ccgcfcgcctc 


ggggggctcg 


cagtacatgg 


catacccgcc 


cagecaagt c 


to U U 


z uggccguuti tc 


aagagaagca 


gctgggcctg 


cgtacaaatt 


cgctccagcg 


caat tec tea 


to to u 


c age tgtege 


aaggaagega 


gaegt teat t 


ccacggcct c 


aaacgee tga 


atacaaccac 


Ton 


t cgcgcgagc 


cc accat gat 


gggcaactac 


gect tcaatc 


cagacaat ca 


gcaaagttat 


n q n 
/ o U 


ga uyy ccaa u 


u egge u. c L. c c 


y 9 9 agagg c c 


agL.cgaagga. 


gcacc a ugct 


cgaggt-aaac 


RAO 
O ft U 


cagggttatt tttccgactt cacaggccag cagatgcaag acaatcgcga ctcgtatggg 900
ggacccaacc gctactcgtc gggagatgcc ttttctccta ccgccgcgat tccacctccc 960
atgatgaacc ccaacgatct ccccttgggc gctgctgaaa ccatgatgcc gctagagccc 1020
cgcgatctgc cttttgacgt ttacgaccct cacaacccca atgtcaaaat gtcaaagttt 1080
gacaacattg gcgctgtctt gcgtcaccga agtcgcacac agccaaggac gactgccttc 1140
tgggtccttg acgcaaaagg caaagagacg gcgtccatca cctgggaaaa ggtggctagt 1200
cgcgcggaaa aggtggccaa agtgattcgg gacaagagca acctctatcg aggcgaccgt 1260
gtggcattag tgtacaggga tacagaaatc attgattttg tcgtggcgtt gatgggctgc 1320
ttcattgcgg gcgttgtagc ggtacccatc aatagcgtcg acgactacca gaaactcatt 1380
cttctcctaa cgacaactca agctcatctc gcattgacca cagacaacaa tctcaaggcc 1440
tttcatcgtg acattagtca gaaccgtctg aaatggccga gtggggtaga gtggtggaag 1500
acgaacgagt ttggcagcca ccaccccaag aaacatgacg atactccagc tttgcaagta 1560
ccagaggttg cctatattga gttctcgcgt gcacctactg gtgaccttcg cggtgtggtg 1620
cttagtcacc ggactattat gcaccaaatg gcctgcatca gtgccatgat tagcacgata 1680
cccaccaacg ctcagagcca agacacgttc agcactagcc tacgggatgc agagggaaag 1740
ttcgttgctc cagcaccgtc cagaaacccc acagaagtga tcctcacgta cctcgacccg 1800
cgcgaaagcg ctggtctcat tctcagtgtc ttgtttgcag tttatggagg ccacaccacc 1860
gtatggctcg agacagcgac catggaaacc ccgggtctat atgcacatct catcaccaaa 1920
tacaagtcca acatactgct agcggattac ccaggcctca agcgcgctgc atacaactac 1980
caacaggatc caatggctac aagaaacttc aagaaaaaca cagaacccaa cttcgcctcc 2040
gtgaagatct gtctgattga cacgcttacc gtcgactgtg aatttcacga aattctcgga 2100
gatcgatatt tcaggccact gcgaaaccct agagcgcgag aactgatcgc gccaatgctc 2160
tgcttgccag aacatggtgg aatgataata tctgtacgcg actggctagg tggagaggag 2220
cgcatgggct gcccgctaag catagcagta gaagagtcag ataatgatga agatgataca 2280
gaggataagt atgcagcggc aaatggctac tccagtctta ttggtggtgg cactacaaag 2340
aacaaaaagg agaagaagaa gaaaggcccg acagagctta cagaaatctt gctggacaag 2400
gaagctctga agatgaacga agtcattgtt ctggccattg gagaagaagc aagcaagcgg 2460
gcaaacgagc ccggcaccat gcgagtcggt gcctttggat accccatacc ggatgcgaca 2520
ctagctattg tagaccctga gacaagtctt ctatgttcac catactcgat aggcgagatc 2580
tgggtagatt cgccttcact ctctggtggc ttctggcagc tgcagaagca tacagagacc 2640
attttccatg ctcgaccata ccgtttcgtt gagggtagcc ctacgccaca gttgcttgaa 2700
ctcgagtttc tgcgtactgg actcctcggc tttgttgtag agggaaaaat atttgtcctt 2760
ggactgtacg aagatcgcat cagacagcgt gttgaatggg tagaaaatgg tcagcttgaa 2820
gccgagcatc gatacttttt tgtgcagcac ctggtcacaa gcattatgaa ggccgtgcca 2880
aaaatttacg actggtaagt gagctgccaa cagagcaagg actgtctaac gtgtcatagc 2940
tcgtcgtttg attcttatgt aaatggtgaa tacctgccaa tcattctcat cgagacgcag 3000
gccgcatcga ctgcgcccac aaacccaggt ggaccaccac aacaattgga tataccattt 3060
ttggattcac tatctgagag gtgcatggag gtcctttacc aagagcatca tttacgggta 3120
tactgcgtga tgattacagc acctaataca cttccacgag tcatcaagaa cggacggcga 3180
gaaattggca atatgctgtg taggagagag tttgacaatg gctctctgcc ctgtgtacac 3240
gtaaagtttg gcattgagcg atcagtgcag aacattgcgc tcggtgacga tcccgctggc 3300
ggcatgtggt catttgaggc atcaatggca cgtcagcaat tcttgatgct ccaagacaag 3360


caatactctg 


gtgtcgatca 


tcgcgaagtc 


gtcattgacg 


acaggacatc 


gactccactc 


3420 


aatcagttct 


cgaatatcca 


cgacctgatg 


caatggcgtg 


tatctcggca 


ggccgaggaa 


3480 


cttgcttact 


gcactgtcga 


cggtcgagga 


aaagagggca 


aaggcgtcaa 


ttggaagaag 


3540 


tttgatcaaa 


aggttgcggg 


cgtagcaatg 


tacctcaaga 


acaaggtcaa 


ggtccaggcc 


3600 


3 0ggcgatcatc 


tccttctgat 


gtacacgcat 


tcagaagaat 


ttgtttatgc 


tgttcatgca 


3660 


tgttttgtgc 


ttggagctgt 


ttgcatacca 


atggcgccaa 


ttgatcagaa 


ccggttgaat 


372 0 


gaggatgcgc 


cggccttgct 


gcatatcctt 


gcagatttca 


aggtcaaagc 


cattcttgtc 


3780 


aacgctgacg 


ttgaccatct 


gatgaagatc 


aagcaagtat 


cgcagcacat 


caaacaatcg 


3840 


gccgctatcc 


tcaagatcag 


tgtgccaaac 


acatacagca 


caacaaagcc 


gccaaagcaa 


3900 


35tccagtggct 


gccgcgacct 


caagcttaca 


attcgaccgg 


catggattca 


ggcgggtttc 


3960 


ccagtgctag 


tctggacata 


ctggacgccc 


gatcaacgtc 


gtatcgcagt 


tcagctgggc 


4020 


catagccaaa 


tcatggcact 


gtgcaaggtc 


caaaaagaaa 


catgccaaat 


gacaagtaca 


4080 


cgaccagtcc 


ttggttgtgt 


ccggagcacg 


ataggacttg 


gtttccttca 


cacttgtctc 


4140 


atgggaatct 


tccttgccgc 


acccacatac 


ctggtgtcac 


ctgttgactt 


tgcacaaaac 


4200 


4 0cctaatattc 


tgttccaaac 


gctttcgcgg 


tacaagatca 


aggatgcata 


tgcaacgagt 


4260 
<210> 3
<211> 1743 
<212> PRT 
<213> Cochliobolus heterostrophus



<400> 3 

Met Leu Glu Val Asn Gln Gly Tyr Phe Ser Asp Phe Thr Gly Gln Gln
1 5 10 15

Met Gln Asp Asn Arg Asp Ser Tyr Gly Gly Pro Asn Arg Tyr Ser Ser
20 25 30

Gly Asp Ala Phe Ser Pro Thr Ala Ala Ile Pro Pro Pro Met Met Asn
35 40 45

Pro Asn Asp Leu Pro Leu Gly Ala Ala Glu Thr Met Met Pro Leu Glu
50 55 60

Pro Arg Asp Leu Pro Phe Asp Val Tyr Asp Pro His Asn Pro Asn Val
65 70 75 80

Lys Met Ser Lys Phe Asp Asn Ile Gly Ala Val Leu Arg His Arg Ser
85 90 95

Arg Thr Gln Pro Arg Thr Thr Ala Phe Trp Val Leu Asp Ala Lys Gly
100 105 110

Lys Glu Thr Ala Ser Ile Thr Trp Glu Lys Val Ala Ser Arg Ala Glu
115 120 125

Lys Val Ala Lys Val Ile Arg Asp Lys Ser Asn Leu Tyr Arg Gly Asp
130 135 140

Arg Val Ala Leu Val Tyr Arg Asp Thr Glu Ile Ile Asp Phe Val Val
145 150 155 160

Ala Leu Met Gly Cys Phe Ile Ala Gly Val Val Ala Val Pro Ile Asn
165 170 175

Ser Val Asp Asp Tyr Gln Lys Leu Ile Leu Leu Leu Thr Thr Thr Gln
180 185 190

Ala His Leu Ala Leu Thr Thr Asp Asn Asn Leu Lys Ala Phe His Arg
195 200 205

Asp Ile Ser Gln Asn Arg Leu Lys Trp Pro Ser Gly Val Glu Trp Trp
210 215 220

Lys Thr Asn Glu Phe Gly Ser His His Pro Lys Lys His Asp Asp Thr
225 230 235 240

Pro Ala Leu Gln Val Pro Glu Val Ala Tyr Ile Glu Phe Ser Arg Ala
245 250 255

Pro Thr Gly Asp Leu Arg Gly Val Val Leu Ser His Arg Thr Ile Met
260 265 270

His Gln Met Ala Cys Ile Ser Ala Met Ile Ser Thr Ile Pro Thr Asn
275 280 285

Ala Gln Ser Gln Asp Thr Phe Ser Thr Ser Leu Arg Asp Ala Glu Gly
290 295 300

Lys Phe Val Ala Pro Ala Pro Ser Arg Asn Pro Thr Glu Val Ile Leu
305 310 315 320

Thr Tyr Leu Asp Pro Arg Glu Ser Ala Gly Leu Ile Leu Ser Val Leu
325 330 335

Phe Ala Val Tyr Gly Gly His Thr Thr Val Trp Leu Glu Thr Ala Thr
340 345 350

Met Glu Thr Pro Gly Leu Tyr Ala His Leu Ile Thr Lys Tyr Lys Ser
355 360 365

Asn Ile Leu Leu Ala Asp Tyr Pro Gly Leu Lys Arg Ala Ala Tyr Asn
370 375 380

Tyr Gln Gln Asp Pro Met Ala Thr Arg Asn Phe Lys Lys Asn Thr Glu
385 390 395 400

Pro Asn Phe Ala Ser Val Lys Ile Cys Leu Ile Asp Thr Leu Thr Val
405 410 415

Asp Cys Glu Phe His Glu Ile Leu Gly Asp Arg Tyr Phe Arg Pro Leu
420 425 430

Arg Asn Pro Arg Ala Arg Glu Leu Ile Ala Pro Met Leu Cys Leu Pro
435 440 445

Glu His Gly Gly Met Ile Ile Ser Val Arg Asp Trp Leu Gly Gly Glu
450 455 460

Glu Arg Met Gly Cys Pro Leu Ser Ile Ala Val Glu Glu Ser Asp Asn
465 470 475 480

Asp Glu Asp Asp Thr Glu Asp Lys Tyr Ala Ala Ala Asn Gly Tyr Ser
485 490 495

Ser Leu Ile Gly Gly Gly Thr Thr Lys Asn Lys Lys Glu Lys Lys Lys
500 505 510

Lys Gly Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu Ala Leu
515 520 525

Lys Met Asn Glu Val Ile Val Leu Ala Ile Gly Glu Glu Ala Ser Lys
530 535 540

Arg Ala Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly Tyr Pro
545 550 555 560

Ile Pro Asp Ala Thr Leu Ala Ile Val Asp Pro Glu Thr Ser Leu Leu
565 570 575

Cys Ser Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro Ser Leu
580 585 590

Ser Gly Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile Phe His
595 600 605

Ala Arg Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln Leu Leu
610 615 620

Glu Leu Glu Phe Leu Arg Thr Gly Leu Leu Gly Phe Val Val Glu Gly
625 630 635 640

Lys Ile Phe Val Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln Arg Val
645 650 655

Glu Trp Val Glu Asn Gly Gln Leu Glu Ala Glu His Arg Tyr Phe Phe
660 665 670

Val Gln His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys Ile Tyr
675 680 685

Asp Cys Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu Pro Ile
690 695 700

Ile Leu Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn Pro Gly
705 710 715 720

Gly Pro Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu Ser Glu
725 730 735

Arg Cys Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val Tyr Cys
740 745 750

Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Ile Lys Asn Gly
755 760 765

Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp Asn Gly
770 775 780

Ser Leu Pro Cys Val His Val Lys Phe Gly Ile Glu Arg Ser Val Gln
785 790 795 800

Asn Ile Ala Leu Gly Asp Asp Pro Ala Gly Gly Met Trp Ser Phe Glu
805 810 815

Ala Ser Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys Gln Tyr
820 825 830

Ser Gly Val Asp His Arg Glu Val Val Ile Asp Asp Arg Thr Ser Thr
835 840 845

Pro Leu Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp Arg Val
850 855 860

Ser Arg Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly Arg Gly
865 870 875 880

Lys Glu Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys Val Ala
885 890 895

Gly Val Ala Met Tyr Leu Lys Asn Lys Val Lys Val Gln Ala Gly Asp
900 905 910

His Leu Leu Leu Met Tyr Thr His Ser Glu Glu Phe Val Tyr Ala Val
915 920 925

His Ala Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala Pro Ile
930 935 940

Asp Gln Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His Ile Leu
945 950 955 960

Ala Asp Phe Lys Val Lys Ala Ile Leu Val Asn Ala Asp Val Asp His
965 970 975

Leu Met Lys Ile Lys Gln Val Ser Gln His Ile Lys Gln Ser Ala Ala
980 985 990

Ile Leu Lys Ile Ser Val Pro Asn Thr Tyr Ser Thr Thr Lys Pro Pro
995 1000 1005

Lys Gln Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg Pro Ala
1010 1015 1020

Trp Ile Gln Ala Gly Phe Pro Val Leu Val Trp Thr Tyr Trp Thr Pro
1025 1030 1035 1040

Asp Gln Arg Arg Ile Ala Val Gln Leu Gly His Ser Gln Ile Met Ala
1045 1050 1055

Leu Cys Lys Val Gln Lys Glu Thr Cys Gln Met Thr Ser Thr Arg Pro
1060 1065 1070

Val Leu Gly Cys Val Arg Ser Thr Ile Gly Leu Gly Phe Leu His Thr
1075 1080 1085

Cys Leu Met Gly Ile Phe Leu Ala Ala Pro Thr Tyr Leu Val Ser Pro
1090 1095 1100

Val Asp Phe Ala Gln Asn Pro Asn Ile Leu Phe Gln Thr Leu Ser Arg
1105 1110 1115 1120

Tyr Lys Ile Lys Asp Ala Tyr Ala Thr Ser Gln Met Leu Asp His Ala
1125 1130 1135

Ile Ala Arg Gly Ala Gly Lys Ser Met Ala Leu His Glu Leu Lys Asn
1140 1145 1150

Leu Met Ile Ala Thr Asp Gly Arg Pro Arg Val Asp Val Tyr Gln Arg
1155 1160 1165

Val Arg Val His Phe Ala Pro Ala Asn Leu Asp Pro Thr Ala Ile Asn
1170 1175 1180

Thr Val Tyr Ser His Val Leu Asn Pro Met Val Ala Ser Arg Ser Tyr
1185 1190 1195 1200

Met Cys Ile Glu Pro Val Glu Leu His Leu Asp Val His Ala Leu Arg
1205 1210 1215

Arg Gly Leu Val Met Pro Val Asp Pro Asp Thr Glu Pro Asn Ala Leu
1220 1225 1230

Leu Val Gln Asp Ser Gly Met Val Pro Val Ser Thr Gln Ile Ser Ile
1235 1240 1245

Val Asn Pro Glu Thr Asn Gln Leu Cys Leu Asn Gly Glu Tyr Gly Glu
1250 1255 1260

Ile Trp Val Gln Ser Glu Ala Asn Ala Tyr Ser Phe Tyr Met Ser Lys
1265 1270 1275 1280

Glu Arg Leu Asp Ala Glu Arg Phe Asn Gly Arg Thr Ile Asp Gly Asp
1285 1290 1295

Pro Asn Val Arg Tyr Val Arg Thr Gly Asp Leu Gly Phe Leu His Ser
1300 1305 1310

Val Thr Arg Pro Ile Gly Pro Asn Gly Ala Pro Val Asp Met Gln Val
1315 1320 1325

Leu Phe Val Leu Gly Ser Ile Gly Asp Thr Phe Glu Val Asn Gly Leu
1330 1335 1340

Asn His Phe Ser Met Asp Ile Glu Gln Ser Val Glu Arg Cys His Arg
1345 1350 1355 1360

Asn Ile Val Pro Gly Gly Cys Ala Val Phe Gln Ala Gly Gly Leu Val
1365 1370 1375

Val Val Val Val Glu Ile Phe Arg Arg Asn Phe Leu Ala Ser Met Val
1380 1385 1390

Pro Val Ile Val Asn Ala Ile Leu Asn Glu His Gln Leu Val Ile Asp
1395 1400 1405

Ile Val Ser Phe Val Gln Lys Gly Asp Phe His Arg Ser Arg Leu Gly
1410 1415 1420

Glu Lys Gln Arg Gly Lys Ile Leu Ala Gly Trp Val Thr Arg Lys Met
1425 1430 1435 1440

Arg Thr Ile Ala Gln Tyr Ser Ile Arg Asp Pro Asn Gly Gln Asp Ser
1445 1450 1455

Gln Met Ile Thr Glu Glu Pro Gly Pro Arg Ala Ser Met Thr Gly Ser
1460 1465 1470

Met Leu Gly Arg Met Gly Gly Pro Ala Ser Ile Lys Ala Gly Ser Thr
1475 1480 1485

Arg Ala Pro Ser Leu Met Gly Met Thr Ala Thr Met Asn Asn Leu Ser
1490 1495 1500

Leu Thr Gln Gln Gln Gln Gln Gln Tyr Gln Gln Pro Gly Met Tyr Ala
1505 1510 1515 1520

Gln Gln Gln Gly Met His Pro Gln Gln Gln His Gln Phe Ser Met Ser
1525 1530 1535

Asn Thr Pro Pro Gln Gly Pro Pro Gln Gly Val Glu Leu His Asp Pro
1540 1545 1550

Ser Asp Arg Thr Pro Thr Asp Asn Arg His Ser Phe Leu Ala Asp Pro
1555 1560 1565

Arg Met Gln Asn Gln Gly Gln Met Asn Glu Thr Gly Ala Tyr Glu Pro
1570 1575 1580

Met Asn Tyr Gln Asn Ala Tyr His Pro His Gln Gln Gln Tyr Glu Ser
1585 1590 1595 1600

Glu Asp Gly Gly Ser Arg Leu Ser Gly Pro Val Pro Asp Val Leu Arg
1605 1610 1615

Pro Gly Pro Ser Ser Gly Ser Ile Glu Gln His Asp Gln Ala Asn Asn
1620 1625 1630

Asp Asn Asn Met Trp Asn Asn Arg Glu Tyr Tyr Gly Asn Ser Pro Ser
1635 1640 1645

Tyr Ala Gly Gly Tyr Thr Gln Asp Gly Asn Ile His Glu Gln Gln Gln
1650 1655 1660

His Asp Glu Tyr Thr Ser Asn Ala Ser Tyr Gly Gly Asn Gln Gly Ala
1665 1670 1675 1680

Gly Gly Gly Ser Gly Gly Gly Gly Gly Leu Arg Val Ala Asn Arg Asp
1685 1690 1695

Ser Ser Asp Ser Glu Gly Ala Asp Asp Asp Ala Trp Arg Arg Asp Ala
1700 1705 1710

Leu Ala Gln Ile Asn Phe Ala Gly Gly Ala Ala Ala Ala Ser Ala Gly
1715 1720 1725

Ala Pro Ala Ala Gly Ala Ser Ser Ser Gln Pro Gly His Ala Gln
1730 1735 1740

<210> 4
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 4
gcggataaca atttcacaca gga 23

<210> 5
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 5
aggcccagct gcttctcttg 20

<210> 6
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 6
actcggaccg gaacggaata acaa 24

<210> 7
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 7
cggaaggagt gcgaacaa 18

<210> 8
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 8
gctgcttgca tctggtcttg 20

<210> 9
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 9
agacccagct gttgcccatt g 21

<210> 10
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 10
cggagacgca aagcctgaga 20

<210> 11
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 11
tgccagctgc gtccaagaag 20

<210> 12
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 12
gctagcatgg ccctcacac 19



<210> 13
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 13
tgtgttgacc tccactagct c 21

<210> 14
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 14
ctacgggatg cagagggaaa gt 22

<210> 15
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 15
gccatgatta gcacgatacc c 21

<210> 16
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 16
cgccgtgcat acaactacca a 21

<210> 17
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 17
tggtggcact acaaagaaca 20

<210> 18
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 18
cagcgtcttg aatgggtaga a 21

<210> 19
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 19
ctgggtagat tcgccttcac 20



<210> 20
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 20
gagcgatcag tgcagaacat t 21

<210> 21
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 21
cgctgacgtt tgaccatctg a 21

<210> 22
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 22
gcatatgcaa cgagtcaaa 19

<210> 23
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 23
acggtgcacc tgttgata 18

<210> 24
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 24
atgcgcacaa tagcccagta 20

<210> 25
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 25
ttcaagcaac tgtggcgtag g 21

<210> 26
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 26
gatcctagcg accgcacacc aac 23

<210> 27
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 27
cctgctgctg gtgcttct 18

<210> 28
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 28
gagttgcaaa tcgtgacagc 20

<210> 29
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 29
tatcagctgt tgttcaatgt tcta 24

<210> 30
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 30
tgttatccca ttgccattg 19

<210> 31



WO 02/42444 

<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 31
aaggacggag attggtggag 20

<210> 32
<211> 17
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 32
ggagatggcg gtgacga 17

<210> 33
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 33
gcatggcttg tggaggac 18

<210> 34
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 34
agattgtggc tagtatggag gtaa 24



<210> 35
<211> 17
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 35
gttttcccag tcacgac 17

<210> 36
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 36
tactactagc ataccagcat acct 24

<210> 37
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 37
tcaacctcgg aataccaagt c 21

<210> 38
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> Sequence around ATG of ORF1

<400> 38
caccatgct 9

<210> 39
<211> 9
<212> DNA
<213> Artificial Sequence

<220>
<223> Fungal consensus

<400> 39
caccatggc 9

<210> 40
<211> 6003
<212> DNA
<213> Fusarium graminearum

<400> 40

ctcgaggtta gtaaaagatc cccgtttgtt ccacaaatct ccatctccct ctcaatgcct 60

ttcttggcgc ctcaacccgc tattttgaag acagtttgtt gttgtcgcat gcgaccaaaa 120

atcatcctct caagttttca tcgctgacct gtttcttggc gtaggaagga gatatcacac 180

agaaagggta agctgctttg cgtccagagt acttacaatt gcttctcaat tacttacgcg 240

ccggcagcta ccaaaagcga cgaactcaac ttttctccca attcctcggt gcacctccac 300

ctcagattgc tgctctcgcc gagcctcagt ctggcctacg catacactcg cccgatgact 360

ccgaccaccc ttcaggcgat ggccatcgcg ctaccgccta tgccgctctc ggtagcagca 420

gcggtccaat cccagattca ccagactcac ctatgtaccg accgcactct ggttatgctc 480

cttcagaatc accaagacct tctccagcac aacctccacc ttccctgctg cgcccggggg 540

gttctctcgc tggaggatcg accactgctc accgcgactc cctcttcttc tccccctccc 600

atctcgaacc tgaaacccgg acaggtacta tgatgtcggg cgactatgca ttcagacccg 660

agcagcaagg cacatatggc gaatcccagc atcaacagca ccagttccag caacagcaac 720

agccacagca gcaacagcag tacgatgggc agcagtatga tggacgaact acaacgcttc 780

tcgattcgca aggatacttt tcggattttg cgggacagca gcactatgat cagactcaaa 840

ccgttgagta tgtgggacct cagcagcggt attcttccag cgatgcattc tctccaaccg 900

ccgcaatggc acctccaatg cttacaacca acgacctccc accgccggaa gcgcttgagt 960

accagctgcc ccttgaccct cgcgaggtac cattcgctat tcaagatccc catgatgatt 1020
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agagtacege 


tactt ctttg 


ttcagcacat 


cgttgtgagc 


attgtcaaga 




acgttccaaa 


gatatacgat 


tgtt cagect 


ttgaegtett 


tgt caatgac 


yaacaccLyc 


O Q D A 

*d 0 O u 


cagu.cgL.ggu 


gcuggaguca 


gcagcugcgc 


caacyycacc 


ra t" 4— <t a na+" 4— 

aLLyacaLCL 


ft ft ra ft ft ra /-» i — < 4— ^-1 

gy aggaccuc 


9 qzl n 


ctcgacaacc 


ggatacagct 


cugc uagagc 


cauuggcuga 


gegcugcaug 


/"r ra ftfti— 4" /^i 4*" /—1 ra 

gay yLLCLca 


onnn 

jUUU 


ug ucayayca 


ucaucugaga 


*«-h /—f 4— *— i /— i 4™ 1 i— t /— * i^»*r 

cug tacuycy 


4~* 4~* 4** i— r 4 — ~\ r^i 

uuaugaucac 


age acc cgac 


•3 r-i 4~ 4-' 4— rr 4— /'-I 
aCLLLgCLLC 




gagttgttaa gaacggacga cgcgaaattg gtaacatgct ttgccgtcgg gagtttgatc 3120

tcggcaacct tccatgtgtg cacgtcaagt ttggcgtgga gcatgcagta cttaacctcc 3180

ctattggtgt agaccctata ggtggtatct ggtcaccgtt ggcgtccgat tctcgtgccg 3240

aattcttatt gccagctgac aagcaatact ctggtgtcga caggcgcgaa gtcgttatcg 3300

atgaccgtac ttcaacgccc ctaaacaatt tctcttgcat ttcggatctt atccaatggc 3360

gcgtggcccg tcaaccagaa gagctagcgt actgcacaat cgatggcaaa agccgagaag 3420

gtaagggtgt aacatggaag aaattcgaca ccaaggtcgc ttccgttgcc atgtacctga 3480

agaacaaggt caaggtgagg ccgggagacc acatcatcct catgtacaca cattcagagg 3540

agtttgtctt tgccatccat gcctgcattt ccttgggcgc aattgtcatt cccatcgcac 3600

ccctcgacca gaaccgattg aacgaagatg tcccagcttt cctgcatatt gtatctgatt 3660

acaacgtcaa ggctgtgctg gtcaacgctg aggtcgatca tctaatcaag gtaaagcctg 3720

tggctagcca tatcaaacag tcagcccagg ttctcaagat cacgagccct gccatctaca 3780

acacaactaa gccgccaaag caaagtagtg gattgaggga tttgagattc accattgacc 3840

ctgcctggat tcggcctggc taccccgtca ttgtttggac ttattggacc cccgatcaac 3900

gacgaatttc agttcagctt ggacatgaca ccattatggg catgtgcaag gttcaaaagg 3960

aaacttgcca aatgacaagt tcaagacctg tgcttggatg tgtacgaagc acgactggcc 4020

taggctttat tcatacggct ctgatgggaa tttatatcgg aacaccaacc tacctcctat 4080

cacctgtcga gtttgcagcc aaccccatgt ctctattcgt caccttgtcg agatacaaga 4140

ttaaggatac ttatgcgaca ccacagatgc ttgatcatgc catgaactcc atgcaggcca 4200

agggctttac acttcatgaa cttaagaaca tgatgatcac tgccgagagc cgaccaagag 4260

ttgatgtttt ccaaaaggtc agacttcact ttgctggggc tgggctcgat agaactgcta 4320

ttaacacggt ctattcgcat gtcctcaacc ccatggtagc gtcgcgatct tatatgtgca 4380

tcgagcctat tgagctttgg ttggacacgc aagcgcttcg acgtggtctg gttattcctg 4440

tggaccctga atcagatcct ctggccctac tggtacagga cagcggtatg gttccagttt 4500

caacccaaat agccatcatc aaccctgaaa gcagaataca ctgcctcgat ggtgagtatg 4560

gtgaaatttg ggtcgactct gaagcctgcg tcaagtcatt ctatggctcc aaagacgctt 4620

ttgacgctga gcgctttgat ggccgagctc ttgacggcga tcccaacatt cagtatatcc 4680

gtaccggaga cttgggtttc cttcataatg ttagtcgacc tattggccct aatggtgccc 4740

aggtggacat gcaagtgttg tttgttctcg gcaacattgg cgagactttt gagatcaacg 4800

gattgagcca tttcccaatg gatattgaga actcggtgga aaaatgccac agaaacattg 4860

tggcgaatgg ctggtaagta taaaatctct atttgaagcg aatatgctaa caaagtcagt 4920

gcggtgttcc aagctggtgg cttggtggtt gttctggttg aagtcaaccg caagccatac 4980

ctggcatcga ttgttcccgt cattgtcaac gctatcctca atgaacacca aatcattgta 5040

gatatcgtcg cattcgtcaa caagggagac ttcccacggt ctcgtctagg agagaagcag 5100

cgtggcaaga ttcttggtgg ctgggttagt agaaagctga ggactcttgc ccagttctcg 5160

attcgcgata tggacgccga atccacagct ggtgatatga tggatccttc tagagcatca 5220

atggtcagcg tacgaagcgg aggcggtgct gctcccggat cttctagttt gaggaahgtc 5280

gaacctgcgc ctcaaatctt ggaggaggaa catgaccaga tgactcctcg tcacgaatac 5340

gaagcagccc ctaccatgat ttctgaactt cccgacggcc aagagacacc gacagggttt 5400

cagcactcgc aatacgaaca cccaccacaa tcagccggtt ctcaagcacc agcccagctg 5460


o oaacccttccc 


accagcccga 


ucaaggaunc 


gatatggact 


4— 4— 4— — ^ ?-\ f-* 4** 

tttCaCyaUd 


uagu u cage a 


c c i n 
b b A U 


gagcccgatc acggccctgt ccacagacgt ccagtcccag gccaagccca acaacccgag 5580

cctatgcaag ggtacggtca agcgccgccc cagatccggc taccaggtgt tgatggacga 5640

gaggagggag ggttctggtc acagcaggaa aagaacgaga agagtgaaga agactggaca 5700

actgatgcca tgatgcatat gaatctggca ggtgatatga aaccgccacg atgataatac 5760

acaacataag agcgaagtga cgaagcggag tcggagttgg gaagcattta gaaacgaata 5820

acaaacaatt ggacttgtcg gtctgatggc ctatttactt cattcataga tgaggattgg 5880

atagtgaata tgtgattgga taaagcctgg gtttgtgagt ttgtgaatgc agtgggtgct 5940

tgctataagc tgttttattg aggtctttgg aggagtgtct aacaaagatg caaagttact 6000

agt 6003



<210> 41
<211> 1692
<212> PRT
<213> Fusarium graminearum

<400> 41

Met Met Ser Gly Asp Tyr Ala Phe Arg Pro Glu Gln Gln Gly Thr Tyr
1 5 10 15

Gly Glu Ser Gln His Gln Gln His Gln Phe Gln Gln Gln Gln Gln Pro
20 25 30

Gln Gln Gln Gln Gln Tyr Asp Gly Gln Gln Tyr Asp Gly Arg Thr Thr
35 40 45

Thr Leu Leu Asp Ser Gln Gly Tyr Phe Ser Asp Phe Ala Gly Gln Gln
50 55 60

His Tyr Asp Gln Thr Gln Thr Val Glu Tyr Val Gly Pro Gln Gln Arg
65 70 75 80

Tyr Ser Ser Ser Asp Ala Phe Ser Pro Thr Ala Ala Met Ala Pro Pro
85 90 95

Met Leu Thr Thr Asn Asp Leu Pro Pro Pro Glu Ala Leu Glu Tyr Gln
100 105 110

Leu Pro Leu Asp Pro Arg Glu Val Pro Phe Ala Ile Gln Asp Pro His
115 120 125

Asp Asp Ser Thr Pro Met Ser Lys Phe Asp Asn Ile Ala Ala Val Leu
130 135 140

Arg His Arg Gly Arg Thr Ile Ala Lys Lys Pro Ala Tyr Trp Val Leu
145 150 155 160

Asp Ser Lys Gly Lys Glu Ile Ala Ser Ile Thr Trp Asp Lys Leu Ala
165 170 175

Ser Arg Ala Glu Lys Val Ala Gln Val Ile Arg Asp Lys Ser Ser Leu
180 185 190

Tyr Arg Gly Asp Arg Val Ala Leu Ile Tyr Arg Asp Ser Glu Val Ile
195 200 205

Asp Phe Ala Ile Ala Leu Leu Gly Cys Phe Ile Ala Gly Val Val Ala
210 215 220

Val Pro Ile Asn Asp Leu Gln Asp Tyr Gln Arg Leu Asn His Ile Leu
225 230 235 240

Thr Thr Thr Gln Ala His Leu Ala Leu Thr Thr Asp Asn Asn Leu Lys
245 250 255

Ala Phe Gln Arg Asp Ile Thr Thr Gln Lys Leu Thr Trp Pro Lys Gly
260 265 270

Val Glu Trp Trp Lys Thr Asn Glu Phe Gly Ser Tyr His Pro Lys Lys
275 280 285

Lys Glu Asp Val Pro Ala Leu Val Val Pro Asp Leu Ala Tyr Ile Glu
290 295 300

Phe Ser Arg Ala Pro Thr Gly Asp Leu Arg Gly Val Val Leu Ser His
305 310 315 320

Arg Thr Ile Met His Gln Met Ala Cys Leu Ser Ala Ile Ile Ser Thr
325 330 335

Ile Pro Gly Asn Gly Pro Gly Asp Thr Phe Asn Pro Ser Leu Arg Asp
340 345 350

Lys Asn Gly Arg Leu Ile Gly Gly Gly Ala Ser Ser Glu Ile Leu Val
355 360 365

Ser Tyr Leu Asp Pro Arg Gln Gly Ile Gly Met Ile Leu Ser Val Leu
370 375 380

Leu Thr Val Tyr Gly Gly His Thr Thr Val Trp Phe Asp Asn Lys Ala
385 390 395 400

Val Asp Val Pro Gly Leu Tyr Ala His Leu Leu Thr Lys Tyr Lys Ser
405 410 415

Thr Ile Met Ile Ala Asp Tyr Pro Gly Leu Lys Arg Ala Ala Tyr Asn
420 425 430

Tyr Gln Gln Glu Pro Met Val Thr Arg Asn Phe Lys Lys Gly Met Glu
435 440 445

Pro Asn Phe Gln Met Ile Lys Leu Cys Leu Ile Asp Thr Leu Thr Val
450 455 460

Asp Ser Gly Ser His Glu Val Leu Ala Asp Arg Trp Leu Arg Pro Leu
465 470 475 480

Arg Asn Pro Arg Ala Arg Glu Val Val Ala Pro Met Leu Cys Leu Pro
485 490 495

Glu His Gly Gly Met Val Ile Ser Val Arg Asp Trp Leu Gly Gly Glu
500 505 510

Glu Arg Met Gly Cys Pro Leu Lys Leu Glu Leu Gly Glu Asp Thr Glu
515 520 525

Ser Asp Glu Glu Lys Glu Glu Thr Glu Lys Pro Ala Val Ser Asn Gly
530 535 540

Phe Gly Ser Leu Leu Ser Gly Gly Gly Thr Ala Thr Thr Glu Glu Arg 
545 550 555 560 

Ala Lys Asn Glu Leu Gly Glu Val Leu Leu Asp Arg Glu Ala Leu Lys 
5 565 570 575 

Thr Asn Glu Val Val Val Val Ala lie Gly Asn Asp Ala Arg Lys Arg 

580 585 590 

Val Thr Asp Asp Pro Gly Leu Val Arg Val Gly Ser Phe Gly Tyr Pro 
595 600 605 

lOIle Pro Asp Ala Thr Leu Ser Val Val Asp Pro Glu Thr Gly Leu Leu 
610 615 620 

Ala Ser Pro His Ser Val Gly Glu lie Trp Val Asp Ser Pro Ser Leu 
625 630 635 640 

Ser Gly Gly Phe Trp Ala Gin Pro Lys Asn Thr Glu Leu lie Phe His 
15 645 650 655 

Ala Arg Pro Tyr Lys Phe Asp Pro Gly Asp Pro Thr Pro Gin Pro Val 

660 665 670 

Glu Pro Glu Phe Leu Arg Thr Gly Leu Leu Gly Thr Val lie Glu Gly 
675 680 685 

2 0Lys lie Phe Val Leu Gly Leu Tyr Glu Asp Arg He Arg Gin Lys Val 

690 695 700 

Glu Trp Val Glu His Gly His Glu Leu Ala Glu Tyr Arg Tyr Phe Phe 
705 710 715 720 

Val Gin His He Val Val Ser He Val Lys Asn Val Pro Lys lie Tyr 
25 725 730 735 

Asp Cys Ser Ala Phe Asp Val Phe Val Asn Asp Glu His Leu Pro Val 

740 745 750 

Val Val Leu Glu Ser Ala Ala Ala Ser Thr Ala Pro Leu Thr Ser Gly 
755 760 765 

3 0Gly Pro Pro Arg Gin Pro Asp Thr Ala Leu Leu Glu Ser Leu Ala Glu 

770 775 780 

Arg Cys Met Glu Val Leu Met Ser Glu His His Leu Arg Leu Tyr Cys 
785 790 795 800 

Val Met He Thr Ala Pro Asp Thr Leu Pro Arg Val Val Lys Asn Gly 
35 805 810 815 

Arg Arg Glu He Gly Asn Met Leu Cys Arg Arg Glu Phe Asp Leu Gly 

820 825 830 

Asn Leu Pro Cys Val His Val Lys Phe Gly Val Glu His Ala Val Leu 
835 840 845 

4 0 Asn Leu Pro He Gly Val Asp Pro He Gly Gly He Trp Ser Pro Leu 
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850 855 860 

Ala Ser Asp Ser Arg Ala Glu Phe Leu Leu Pro Ala Asp Lys Gin Tyr 
865 870 875 880 

Ser Gly Val Asp Arg Arg Glu Val Val lie Asp Asp Arg Thr Ser Thr 
5 885 890 895 

Pro Leu Asn Asn Phe Ser Cys lie Ser Asp Leu lie Gin Trp Arg Val 

900 905 910 

Ala Arg Gin Pro Glu Glu Leu Ala Tyr Cys Thr lie Asp Gly Lys Ser 
915 920 925 

lOArg Glu Gly Lys Gly Val Thr Trp Lys Lys Phe Asp Thr Lys Val Ala 
930 935 940 

Ser Val Ala Met Tyr Leu Lys Asn Lys Val Lys Val Arg Pro Gly Asp 
945 950 955 960 

His lie lie Leu Met Tyr Thr His Ser Glu Glu Phe Val Phe Ala lie 
15 965 970 975 

His Ala Cys lie Ser Leu Gly Ala He Val He Pro He Ala Pro Leu 

980 985 990 

Asp Gin Asn Arg Leu Asn Glu Asp Val Pro Ala Phe Leu His He Val 
995 1000 1005 

2 0 Ser Asp Tyr Asn Val Lys Ala Val Leu Val Asn Ala Glu Val Asp His 

1010 1015 1020 

Leu He Lys Val Lys Pro Val Ala Ser His He Lys Gin Ser Ala Gin 
1025 1030 1035 1040 

Val Leu Lys He Thr Ser Pro Ala He Tyr Asn Thr Thr Lys Pro Pro 
25 1045 1050 1055 

Lys Gin Ser Ser Gly Leu Arg Asp Leu Arg Phe Thr He Asp Pro Ala 

1060. 1065 1070 

Trp He Arg Pro Gly Tyr Pro Val He Val Trp Thr Tyr Trp Thr Pro 
1075 1080 1085 

3 0Asp Gin Arg Arg He Ser Val Gin Leu Gly His Asp Thr He Met Gly 

1090 1095 1100 

Met Cys Lys Val Gin Lys Glu Thr Cys Gin Met Thr Ser Ser Arg Pro 
1105 1110 1115 1120 

Val Leu Gly Cys Val Arg Ser Thr Thr Gly Leu Gly Phe He His Thr 
35 1125 1130 1135 

Ala Leu Met Gly He Tyr He Gly Thr Pro Thr Tyr Leu Leu Ser Pro 

1140 1145 1150 

Val Glu Phe Ala Ala Asn Pro Met Ser Leu Phe Val Thr Leu Ser Arg 
1155 1160 1165 

4 0Tyr Lys He Lys Asp Thr Tyr Ala Thr Pro Gin Met Leu Asp His Ala 
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1170 1175 1180 

Met Asn Ser Met Gln Ala Lys Gly Phe Thr Leu His Glu Leu Lys Asn
1185 1190 1195 1200

Met Met Ile Thr Ala Glu Ser Arg Pro Arg Val Asp Val Phe Gln Lys
1205 1210 1215

Val Arg Leu His Phe Ala Gly Ala Gly Leu Asp Arg Thr Ala Ile Asn
1220 1225 1230

Thr Val Tyr Ser His Val Leu Asn Pro Met Val Ala Ser Arg Ser Tyr
1235 1240 1245

Met Cys Ile Glu Pro Ile Glu Leu Trp Leu Asp Thr Gln Ala Leu Arg
1250 1255 1260

Arg Gly Leu Val Ile Pro Val Asp Pro Glu Ser Asp Pro Leu Ala Leu
1265 1270 1275 1280

Leu Val Gln Asp Ser Gly Met Val Pro Val Ser Thr Gln Ile Ala Ile
1285 1290 1295

Ile Asn Pro Glu Ser Arg Ile His Cys Leu Asp Gly Glu Tyr Gly Glu
1300 1305 1310

Ile Trp Val Asp Ser Glu Ala Cys Val Lys Ser Phe Tyr Gly Ser Lys
1315 1320 1325

Asp Ala Phe Asp Ala Glu Arg Phe Asp Gly Arg Ala Leu Asp Gly Asp
1330 1335 1340

Pro Asn Ile Gln Tyr Ile Arg Thr Gly Asp Leu Gly Phe Leu His Asn
1345 1350 1355 1360

Val Ser Arg Pro Ile Gly Pro Asn Gly Ala Gln Val Asp Met Gln Val
1365 1370 1375

Leu Phe Val Leu Gly Asn Ile Gly Glu Thr Phe Glu Ile Asn Gly Leu
1380 1385 1390

Ser His Phe Pro Met Asp Ile Glu Asn Ser Val Glu Lys Cys His Arg
1395 1400 1405

Asn Ile Val Ala Asn Gly Cys Ala Val Phe Gln Ala Gly Gly Leu Val
1410 1415 1420

Val Val Leu Val Glu Val Asn Arg Lys Pro Tyr Leu Ala Ser Ile Val
1425 1430 1435 1440

Pro Val Ile Val Asn Ala Ile Leu Asn Glu His Gln Ile Ile Val Asp
1445 1450 1455

Ile Val Ala Phe Val Asn Lys Gly Asp Phe Pro Arg Ser Arg Leu Gly
1460 1465 1470

Glu Lys Gln Arg Gly Lys Ile Leu Gly Gly Trp Val Ser Arg Lys Leu
1475 1480 1485

Arg Thr Leu Ala Gln Phe Ser Ile Arg Asp Met Asp Ala Glu Ser Thr
1490 1495 1500

Ala Gly Asp Met Met Asp Pro Ser Arg Ala Ser Met Val Ser Val Arg
1505 1510 1515 1520

Ser Gly Gly Gly Ala Ala Pro Gly Ser Ser Ser Leu Arg Asn Val Glu
1525 1530 1535

Pro Ala Pro Gln Ile Leu Glu Glu Glu His Asp Gln Met Thr Pro Arg
1540 1545 1550

His Glu Tyr Glu Ala Ala Pro Thr Met Ile Ser Glu Leu Pro Asp Gly
1555 1560 1565

Gln Glu Thr Pro Thr Gly Phe Gln His Ser Gln Tyr Glu His Pro Pro
1570 1575 1580

Gln Ser Ala Gly Ser Gln Ala Pro Ala Gln Leu Asn Leu Ser His Gln
1585 1590 1595 1600

Pro Asp Gln Gly Phe Asp Met Asp Phe Ser Arg Tyr Ser Ser Ala Glu
1605 1610 1615

Pro Asp His Gly Pro Val His Arg Arg Pro Val Pro Gly Gln Ala Gln
1620 1625 1630

Gln Pro Glu Pro Met Gln Gly Tyr Gly Gln Ala Pro Pro Gln Ile Arg
1635 1640 1645

Leu Pro Gly Val Asp Gly Arg Glu Glu Gly Gly Phe Trp Ser Gln Gln
1650 1655 1660

Glu Lys Asn Glu Lys Ser Glu Glu Asp Trp Thr Thr Asp Ala Met Met
1665 1670 1675 1680

His Met Asn Leu Ala Gly Asp Met Lys Pro Pro Arg
1685 1690

<210> 42 
<211> 2369 
<212> DNA
<213> Alternaria solani



<400> 42 



aagaagaaag ggccgaccga gttgaccgaa atattgctag ataaggaagc actgaagctg 60

aacgaagttg ttgttttggc cattggagag gaagtgagca agcgtgtcaa cgaacccggc 120

actatgagag tcggtgcttt tggctacccg ataccagatg cgacgctggc cgtcgtcgat 180

ccggaaacta atcttttgtg ttcaccctat tccataggag agatctgggt agactcgcca 240

tcattgtccg gagggttttg gcagctgcag aagcacactg agactatttt ccacgctcgg 300

ccatatcgtt tcgtagaggg cagcccaacc ccgcaactac tcgaactgga gtttctacgc 360

actggactgc tcggatgcgt ggtagaaggc aaaatcttcg tattaggcct gtacgaggac 420

cggattaggc agcgcgttga atgggtagag cacggtcagc tagaagccga acataggtat 480
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gt caactgga 


agaagtt cga 


ccagaaggt c 


i o a n 


gcaggtgtcg 


ccatgtacct 


gaagaacaag 


gt caagggtc 


agactggtga 


/"i ^ jt 4— /^r 4- s~^ 

ccaccugcuc 


T O £T A 


LLyaLy LaCa 


cccacLCyga 


agacLLLgLC 


tatgeegtae 


"*\ st ft st 4** i^r 4*" 4-" 4"* 

acgcgugcuL. 


ft ft 4~ y-t 4~ 4-^ /~*f-*r 23 

cguccuugga 




loyCLy uy tyLa 


/^i !Q -h- **** /«r 

LacccaLyyc 


accaatcgac 


cagaacaggc 


4^* n n "3 4"" ST ST ^ 

taaaCgaaya 


ft ft ft ft ft ft r~t i-v ft 23 

cgcgcccgca 


IjoU 


CLaCLaCaLa 


4— — ■ i 4— 4— /y r^l 4"~ r^r ^ 

cca LCycuya 


CLLcaaggLC 


aaggc ua tec 


4*~ S~i ST 4"*" /*1 Q 4~ ST ST 

LcgLcaaLgc 


uggeguagae 


1 A A A 


/T «<-> /—I 4— —i /->f *-\ 

CoccLyaLga 


agg Luaayca 


ayLaLcgcag 


/^i *rs 4 W r^i rzs r?s 

cacaccaaac 


agucagcagL 


catLCLcaay 


IjUU 


af*n2 art /~» 4- 23 /~r 
aLLaaLytaL 


ft ft -a ^\ +- /^t 4— 

cyaaLaccLa 


Ldacdccaca 


•^•ZST^ftft*Tlftffi— o 


dgodgLL- cdy 


4— ftfft— 4~ i^r/^i ft ft ft 

uyyuuyccyc 




y^r ^ 4-» 4— 4— ft ft 

yaLCLLaayC 


LcacaaLacy 


ac c ug c c ugg 


^-s 4— <a <^ ^\ 4>" /"i 4~ 

acacaaLCLg 


4^ 4-. 4— /^i /«j /^i 4— 4-. 
yLLLCCCLgt 


4— ft 4- -2j ft 4/^ 23 4— ft ft 

UGusyuaugy 


1 <ron 
ID ^ VJ 


O A *2i t\ 4— 4*. /^r/"< 23 

ZUaCaCaCtgya 


ft ^\ S~i ft 4/* ft 

caccuyacca 


gagacgcata 


gctgtgcaat 


4 — ^ f~t fX 4^ 23 4^~ 23 

cay gc cat. ay 


ft ft 23 23 23 4 — ft ^ 4~ ' ft 

ccaaatLaLg 


iron 
IDOU 


gcgctatgca 


a.a.gt t cagaa 


agaaacgtgc 


cagatgacga 


gcacacggcc 


cgt ccttgga 


T *7 A A 


ugugu Ley ua 


cjcacgatcgg 


tcttggcttc 


atacacacct 


/-w- +- i-^T 4^ 4~" STST/T 

grgccacggg 


4— 4™* rf^n 4— 4^ /^i r^i 4" /^i 

uauccuccuc 


-t Q A A 


yCaycgccaa 


CLLaCCLCyL 


gtcacctgtc 


gattttgege 


aaaacccgaa 


ft >a +- nn+- /^4— 4- ft 

GcLUCCUCUUC 


lODU 


ft 23 ftT\ ft ft "~\ 4~ r*f 4— 

cayaccatyL 


cyagaLacaa 


gatcaaggac 


gegtatgega 


<>~i /^i ^» ft ft ft a >a n 4" 

ccayccaaaL 


ft ft 4— ft ft 23 /""l z 1 "! 23 ft 

gcLggaccac 




O CT /**r rt 4"- 4— 4" , — 1 ^ #— • 

zoyCLauLgfcac 


gaggtgcrgg 


caagaacatg gctctgcacg 


ayCLcaagaa 


ft ft 4"~ ra 4™* *a 4^ /~ 1 

ccucaugauc 




gcgacugacg 


gt cgyccgcg 


cgtagacgtc 


tgtaagtgtt 


gcgaucccgL. 


n-% 4~" r3 q r^c /*i o 4~ /**t 4— 

auaagcaucu 


O A A A 


gaaatctaat tcttgataga ccagcgtgtg cgagtacact tctcgccagc aagtttggac 2100

cgaacggcaa tcaatactgt ttactcacac gtactgaatc ctatggtcgc atcgcggtca 2160

tacatgtgca tcgaacccat agaactacat ctcgatgtcg gtgcccttcg aagaggtctc 2220

atcatgcctg tcgacccaga cacggaacct ggtgctctct tagtccagga ctcgggtatg 2280

gtaccagtta gtacacaaat ttcaatcgtg aatccagaga caaaccagct ttgcctagtc 2340

ggcgagtatg gcgaaatctg ggtccaacc 2369



<210> 43 
<211> 758
<212> PRT 

<213> Alternaria solani 
<400> 43 

Lys Lys Lys Gly Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu
1 5 10 15

Ala Leu Lys Leu Asn Glu Val Val Val Leu Ala Ile Gly Glu Glu Val
20 25 30

Ser Lys Arg Val Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly
35 40 45

Tyr Pro Ile Pro Asp Ala Thr Leu Ala Val Val Asp Pro Glu Thr Asn
50 55 60

Leu Leu Cys Ser Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro
65 70 75 80

Ser Leu Ser Gly Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile
85 90 95

Phe His Ala Arg Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln
100 105 110

Leu Leu Glu Leu Glu Phe Leu Arg Thr Gly Leu Leu Gly Cys Val Val
115 120 125

Glu Gly Lys Ile Phe Val Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln
130 135 140

Arg Val Glu Trp Val Glu His Gly Gln Leu Glu Ala Glu His Arg Tyr
145 150 155 160

Phe Phe Val Gln His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys
165 170 175

Ile Tyr Asp Cys Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu
180 185 190

Pro Ile Ile Leu Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn
195 200 205

Pro Gly Gly Pro Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu
210 215 220

Ser Glu Arg Cys Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val
225 230 235 240

Tyr Cys Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Ile Lys
245 250 255

Asn Gly Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp
260 265 270

Asn Gly Ser Leu Pro Cys Val His Val Lys Phe Gly Val Glu Arg Ser
275 280 285

Val Gln Asn Ile Ala Leu Gly Asp Asp Pro Ala Gly Gly Met Trp Ser
290 295 300

Tyr Glu Ala Ser Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys
305 310 315 320

Gln Tyr Ser Gly Val Asp His Arg Glu Val Val Ile Asp Asp Arg Thr
325 330 335

Ser Thr Pro Leu Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp
340 345 350

Arg Val Gln Arg Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly
355 360 365

Arg Gly Lys Glu Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys
370 375 380

Val Ala Gly Val Ala Met Tyr Leu Lys Asn Lys Val Lys Gly Gln Thr
385 390 395 400

Gly Asp His Leu Leu Leu Met Tyr Thr His Ser Glu Asp Phe Val Tyr
405 410 415

Ala Val His Ala Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala
420 425 430

Pro Ile Asp Gln Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His
435 440 445

Ile Ile Ala Asp Phe Lys Val Lys Ala Ile Leu Val Asn Ala Gly Val
450 455 460

Asp His Leu Met Lys Val Lys Gln Val Ser Gln His Ile Lys Gln Ser
465 470 475 480

Ala Val Ile Leu Lys Ile Asn Val Pro Asn Thr Tyr Asn Thr Thr Lys
485 490 495

Pro Pro Lys Gln Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg
500 505 510

Pro Ala Trp Ile Gln Ser Gly Phe Pro Val Leu Val Trp Thr Tyr Trp
515 520 525

Thr Pro Asp Gln Arg Arg Ile Ala Val Gln Leu Gly His Ser Gln Ile
530 535 540

Met Ala Leu Cys Lys Val Gln Lys Glu Thr Cys Gln Met Thr Ser Thr
545 550 555 560

Arg Pro Val Leu Gly Cys Val Arg Ser Thr Ile Gly Leu Gly Phe Ile
565 570 575

His Thr Cys Val Met Gly Ile Phe Leu Ala Ala Pro Thr Tyr Leu Val
580 585 590

Ser Pro Val Asp Phe Ala Gln Asn Pro Asn Ile Leu Phe Gln Thr Met
595 600 605

Ser Arg Tyr Lys Ile Lys Asp Ala Tyr Ala Thr Ser Gln Met Leu Asp
610 615 620

His Ala Ile Ala Arg Gly Ala Gly Lys Asn Met Ala Leu His Glu Leu
625 630 635 640

Lys Asn Leu Met Ile Ala Thr Asp Gly Arg Pro Arg Val Asp Val Tyr
645 650 655

Gln Arg Val Arg Val His Phe Ser Pro Ala Ser Leu Asp Arg Thr Ala
660 665 670

Ile Asn Thr Val Tyr Ser His Val Leu Asn Pro Met Val Ala Ser Arg
675 680 685

Ser Tyr Met Cys Ile Glu Pro Ile Glu Leu His Leu Asp Val Gly Ala
690 695 700

Leu Arg Arg Gly Leu Ile Met Pro Val Asp Pro Asp Thr Glu Pro Gly
705 710 715 720

Ala Leu Leu Val Gln Asp Ser Gly Met Val Pro Val Ser Thr Gln Ile
725 730 735

Ser Ile Val Asn Pro Glu Thr Asn Gln Leu Cys Leu Val Gly Glu Tyr
740 745 750

Gly Glu Ile Trp Val Gln



<210> 44 
<211> 2320 
<212> DNA 

<213> Pyrenophora teres

<400> 44 

aaaaagaagg ggcctacgga gttgaccgag atattgctag ataaggaagc gctcaagatg 60

aacgatgttg tggtccttgc aataggagaa gaggccagta aacgtgcgaa tgagcctggc 120

acaatgcgag ttggcgcttt tggataccca ataccagatg cgacgctagc cgtcgtagat 180

ccagagacga atctcttgtg ttcaccctac tcgataggag agatttgggt agactcacct 240

tcattgtctg gtggtttctg gcaattgcag aagcacactg aaactatatt tcacgcccgc 300

ccataccgct ttgtggaggg cagtcctacc ccgcagttgc ttgagcttga gtttctccgg 360

acaggcttac tcggattcgt cgtagagggc aaggtcttta tccttggtct ctatgaagat 420

cgcatcaggc agcgcgttga atgggtagaa catggtcagc tggaagctga acacagatac 480

ttcttcgtgc agcacctcgt caccagtatc atgaaggctg ttcccaagat ctacgactgg 540

taagtcttct catgttttag atgagcgttc taacactatg cagctcatct ttcgactcgt 600

acgtcaatgg cgaatacctg cctatcatcc tcatcgagac acaggctgca tcgacagccc 660

ctacgaaccc tggtggaccg ccacagcaac tcgacatccc cttcctagac tcactgtctg 720

agcgatgcat ggaagtgttg tatcaagaac accatctgcg agtatactgc gtcatgatca 780

cagcgccaaa cacattacca cgagttgtta agaatggtcg acgagaaatt ggcaacatgc 840

tctgtcgaag agaatttgat aatggctcat taccttgtgt ccacgtcaag tttggtgttg 900

agaggtcagt tctcaacatc gcgttgggtg atgacccctc cggaggcatg tggtcatatg 960

aagcctcgat ggcgcgtcag cagttcttga tgctccaaga caagcagtat tctggagtag 1020

atcaccgcga agtcgtcatg gatgacagaa catcgacacc tctcaaccaa ttctccaaca 1080
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"H +■ 23 /"■• <t ts <"» 4— 


CatyCaatgy 


eg cytaucac 


ggcaggctga 


agagctcgea 


tattgeacag 


1 1 A f\ 




aggcaaagaa 


ggcaagggcg 


tcaactggaa 


gaagttcgac 


cagaaagttg 


"1 O A A 


cgggtgtcgc 


aatgtacctg 


aagaacaagg 


teaaagtgea 


aaceggegat 


catctgcttc 




t ga t g 1 a. t a c 


gcac tcggaa 


gac t ttgtat 


atgeggtaca 


tgcatgcttt 


gtgcttggcg 


13 2 0 


4— (f-Y+- "^3 4— r~x d 4— 

jCtgUatyCaL 


accaaugy ca 


ccaaucgacc 


agaac cgau l. 


gaatgaggat 


gcacctgcat 


"1 "3 Q A 


4— 4~~ /-r •ana+- 


/•■i /*i 4"" 4"" *ri /""t ^ in 

ccLLycagac 


ttcaaggeca 


aggccatccc 


cgucaacycc 


gatgtggafcc 


144 U 


atCtUaLgaa 


/"-TJT4— /~i -a -a /t/""! a m 

gytCaayCaa 


gtategcage 


acaccaaaca 


aLcaycagcc 


atctucaaga 


1 tr a a 
1DUU 


LCaaCy tyCC 


/-"T /~i m r~i 4— 4~~ *-\ i 

yCaCaCLLaC 


aacacaacca 


agccacctaa 


gcagtcgagt 


ggttgtcggg 


lo o U 




LaLaa LaL.yy 


/—i /^i 4— rr /-i /~-t 4— /-t/t/t 

LLuyuoLyyy 


LdCdyLcugg 


4— 4- +" nrirta i^ft— 4— 

Lttc tcay l. c 


cuLguatyga 


i (Ton 




4— /-t iri 2a /~to t*/ia 2 
LLLay a LLaa 


eg ^cg Lduag 


ccgtaoaacL 


aggtcauagc 


caaatcaugg 


lOOU 


LaCtayyCaa 


yytccayaag 


gayacutytc 


aaatgacaag 


tacaaggeca 


gt cctaggat 


i / 4 U 


gtgtacggag 


taccat cgga 


cttggc ttca 


t t catacc tg 


catcatgggc 


at c ttccttg 


T O A A 


ccgcacccac 


ttacctcgtg 


t cgcctgt eg 


ac tttgeaca 


aaatccaaac 


atactcttcc 


"1 O £T A 


-"v /-<r ^ /~l /-r4— t - ■*! +" /-t 

ayacy ttatc 


aayauacaay 


atcaagaauy 


cgtacgcaac 


cagecaaaug 


Luggaucacg 


-1 QO A 


ijctaLugcccy 


cggggccgga 


aayaacatgg 


4^ /^r /"i ^ /*n /^r "3 

ccctgcacga 


actcaagaaL 


ctcaugattg 


1 Q Q A 


cgactgatgg taggccgcgt gttgatgttt accagagagt gcgcgtacac ttttcaccag 2040

caagcttgga ccggacagcg attaacacag tctactctca cgtgctcaac ccaatggtag 2100

catcgcgatc atacatgtgc atcgagccaa tagaactgca tctcgacgtc aacgctcttc 2160

gaagaggtct gatcatgccc gtcgacccag ataccgagcc tggcgctcta atggtccagg 2220

actctggtat ggtgccagtc tccacacaaa tagcaattgt gaacccagag acaaaccagc 2280

tttgcttggt tggcgaatat ggcgaaatct gggttcaatc 2320



<210> 45 
<211> 758 
<212> PRT

<213> Pyrenophora teres 



<400> 45 

Lys Lys Lys Gly Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu
1 5 10 15

Ala Leu Lys Met Asn Asp Val Val Val Leu Ala Ile Gly Glu Glu Ala
20 25 30

Ser Lys Arg Ala Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly
35 40 45

Tyr Pro Ile Pro Asp Ala Thr Leu Ala Val Val Asp Pro Glu Thr Asn
50 55 60

Leu Leu Cys Ser Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro
65 70 75 80

Ser Leu Ser Gly Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile
85 90 95

Phe His Ala Arg Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln
100 105 110

Leu Leu Glu Leu Glu Phe Leu Arg Thr Gly Leu Leu Gly Phe Val Val
115 120 125

Glu Gly Lys Val Phe Ile Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln
130 135 140

Arg Val Glu Trp Val Glu His Gly Gln Leu Glu Ala Glu His Arg Tyr
145 150 155 160

Phe Phe Val Gln His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys
165 170 175

Ile Tyr Asp Cys Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu
180 185 190

Pro Ile Ile Leu Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn
195 200 205

Pro Gly Gly Pro Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu
210 215 220

Ser Glu Arg Cys Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val
225 230 235 240

Tyr Cys Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Val Lys
245 250 255

Asn Gly Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp
260 265 270

Asn Gly Ser Leu Pro Cys Val His Val Lys Phe Gly Val Glu Arg Ser
275 280 285

Val Leu Asn Ile Ala Leu Gly Asp Asp Pro Ser Gly Gly Met Trp Ser
290 295 300

Tyr Glu Ala Ser Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys
305 310 315 320

Gln Tyr Ser Gly Val Asp His Arg Glu Val Val Met Asp Asp Arg Thr
325 330 335

Ser Thr Pro Leu Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp
340 345 350

Arg Val Ser Arg Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly
355 360 365

Arg Gly Lys Glu Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys
370 375 380

Val Ala Gly Val Ala Met Tyr Leu Lys Asn Lys Val Lys Val Gln Thr
385 390 395 400

Gly Asp His Leu Leu Leu Met Tyr Thr His Ser Glu Asp Phe Val Tyr
405 410 415
Ala Val His Ala Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala
420 425 430

Pro Ile Asp Gln Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His
435 440 445

Ile Leu Ala Asp Phe Lys Val Lys Ala Ile Leu Val Asn Ala Asp Val
450 455 460

Asp His Leu Met Lys Val Lys Gln Val Ser Gln His Ile Lys Gln Ser
465 470 475 480

Ala Ala Ile Phe Lys Ile Asn Val Pro His Thr Tyr Asn Thr Thr Lys
485 490 495

Pro Pro Lys Gln Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg
500 505 510

Pro Ala Trp Val Gln Pro Gly Phe Pro Val Leu Val Trp Thr Tyr Trp
515 520 525

Thr Pro Asp Gln Arg Arg Ile Ala Val Gln Leu Gly His Ser Gln Ile
530 535 540

Met Ala Leu Gly Lys Val Gln Lys Glu Thr Cys Gln Met Thr Ser Thr
545 550 555 560

Arg Pro Val Leu Gly Cys Val Arg Ser Thr Ile Gly Leu Gly Phe Ile
565 570 575

His Thr Cys Ile Met Gly Ile Phe Leu Ala Ala Pro Thr Tyr Leu Val
580 585 590

Ser Pro Val Asp Phe Ala Gln Asn Pro Asn Ile Leu Phe Gln Thr Leu
595 600 605

Ser Arg Tyr Lys Ile Lys Asn Ala Tyr Ala Thr Ser Gln Met Leu Asp
610 615 620

His Ala Ile Ala Arg Gly Ala Gly Lys Asn Met Ala Leu His Glu Leu
625 630 635 640

Lys Asn Leu Met Ile Ala Thr Asp Gly Arg Pro Arg Val Asp Val Tyr
645 650 655

Gln Arg Val Arg Val His Phe Ser Pro Ala Ser Leu Asp Arg Thr Ala
660 665 670

Ile Asn Thr Val Tyr Ser His Val Leu Asn Pro Met Val Ala Ser Arg
675 680 685

Ser Tyr Met Cys Ile Glu Pro Ile Glu Leu His Leu Asp Val Asn Ala
690 695 700

Leu Arg Arg Gly Leu Ile Met Pro Val Asp Pro Asp Thr Glu Pro Gly
705 710 715 720

Ala Leu Met Val Gln Asp Ser Gly Met Val Pro Val Ser Thr Gln Ile
725 730 735

Ala Ile Val Asn Pro Glu Thr Asn Gln Leu Cys Leu Val Gly Glu Tyr
740 745 750

Gly Glu Ile Trp Val Gln
755


<210> 46 
<211> 2435 
<212> DNA

<213> Coccidioides immitis 




<400> 46 



ggggtggaat ggtggaagac aaacgagttt ggtagctatc accctaagcg aaaggatgag 60

atgccccccc tagccgtccc ggatttggca tacatcgagt ttgcgagggc tcccactggc 120

gatttgcggg gagtggtgat gagccaccgc accatcatgc atcaaatgtg ctgcatgtct 180

gcgatagtat ctacgattcc caccgattcc aataatagcg ggaaacccgt gccaagacct 240

cacggcgaaa tcctgatgag ttatctcgat cctagacaag gcattggcat gatccttggt 300

gttctcctta cggtctatgc tggcaatact actgtttggc tagagtccct agcggttgaa 360

actcccggcc tttatgctag tttgatcacc aagtacaggg ctgctctgct ggcagcagat 420

tacccgggcc ttaagagggc cgtgtacaat taccagcaag atccgatggc gacaagaaat 480

tttaagaaga attcagagcc aaacttctca agcttgaagt tgtgtcttat agatacttta 540

actgtcgact gcgaattcca tgaaatcctc gccgacagat ggttaaggcc cttgcggaat 600

ccgcgggctc gcgaactagt tacgcccatg ctgtgccttc cagaacacgg tggcatggtt 660

atcagtttac gtgactggct tggaggcgag gagcgtatgg ggtgcccttt gaaacatgaa 720

gtactgccac cggaaaagca gaaagacaag tccgaaggtg agaaaaaaga agaagagaag 780

ggcggagagc caaaggcgac gttcgggagc agcttgattg gtggttctgc ggcgccgata 840


cgaaaagaag gcccccggaa cgaccttggt gaggtactac ttgacaaaga agccttgaaa 900

aacaacgaaa ttgtgatatt agcaattggt gaggaggcaa gaaggctggc tgacacaaca 960

ccaaatgctg tcagggttgg tgcatttggg tatcccattc cagatgcaac gttagcgatc 1020

gttgatccag agactgggtt gctgtgcacg cctaatgtgg ttggtgagat atgggttgat 1080

tcaccttcat tgtcaggagg attctgggcc cttcccaaac aaacggagtc catcttccat 1140

gcccgtccct accgatttca gggagggggt cccacacctg taatcgtgga gcctgaattc 1200

ttgcgaacag ggcttcttgg ctgtgttatt gagggtcaaa tattcgtgct tggtctctac 1260

gaagatcgct tgcgccaaaa agttgaatgg gttgagcatg gcgtagaagt tgcagagcac 1320

cgatatttct tcgtgcaaca tctgattctc agtattatga agaacgtgcc caaaattcac 1380

gactgctctg cctttgacgt cttcgtcaac gaggagcacc tgccagtcgt tgtcttggag 1440

tcgtacactg cctcaacagc accagtagct tcagggcaat ccccacgaca gctggacgtt 1500

cctcttttgg actccttggc tgagaaatgc atgggagtgc tataccaaga acatcatctt 1560

cgcgtttatt gtgtcatgat cactgccccg aataccttgc ctagagttct taaaaatggg 1620

cgccaagaga ttggcaacat gctatgtcga aaagaatttg ataatgggtc gctgccatgc 1680

gagcacgtta aattcagcgt tgagcggtcg gttctgagtc ttccaattgg cgtggatccc 1740
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+— 4— 4-~ /~f/^r4— /-I 4— /^r+- 

tti-gy Lcuyc 




yCLyctdyyc 


dy y d Ly luL/ u. 


t>y LLaL-yody 


lOUU 


g a. 3. a. ag c a. a. t 


attcaggagt 


cgatt tgcgg 


gacgut.au u a 


tggatgat eg 


CdCCUCCdCg 


i o e? n 


ccattgaata 


attttaacag 


tatcgttgat 


ttacttcagt 


ggcgtgtttc 


t cgccagggc 


x y a u 


gaggaac ttfc 


w- 4— 4— f*\ 4*» 4— e-x 4~* n 

gu t_ a n. izgc u c 


4— 4 — r~i /-y r-i f~\ — 

taLCgacggt 


cgtggcagag 


aaggcaaggg 


4 — 4— \ /—I -> 4— s-x/T 

LdUCdCdLyg 




C n a «a ^ a 4— 4— 

DadgaaattCg 


4— 4— t-\ 4— ^ ^ /t 4— 

attCCaaagL 


tgcagcLgtg 


gctgcgtatt 


tgaaaaataa 


a <**T 4~ rra a "H f" 1 


9 n a n 

U rfc U 


cyccccggcg 


arpaf" rT"f~ t" a f- 
aLl-aL.y ILaL 


4- ( r-t+- < ^i5a4-y- f 4--3f" 
LLLUaLy LaL. 


acgcactcgg 


aagagtacgt 


O.L- LL-y UL.y U-Cl 


X U W 


catgcttgct tctgcctggg cttggtagcc attcccattt ccccagttga ccagaaccga 2160

ctatccgaag atgcgccggc tttactccat gtcattgtcg atttccgtgt aaaagccata 2220

cttgtcaacg gcgaagtcaa tgacttactg aaacagaaaa tcgtatctca gcatatcaag 2280

cagtctgctc atgttgtccg cacgagcgtt ccaagtgtat acaatacgtc gaagccccca 2340

aagcaatcgc acggttgccg ccatctagga tttactatga atccccaatg gttgaattct 2400

aagcagccag cagtgatttg gacctactgg acgcc 2435



<210> 47 

<211> 812

<212> PRT 

<213> Coccidioides immitis 



<400> 47 

Gly Val Glu Trp Trp Lys Thr Asn Glu Phe Gly Ser Tyr His Pro Lys
1 5 10 15

Arg Lys Asp Glu Met Pro Pro Leu Ala Val Pro Asp Leu Ala Tyr Ile
20 25 30

Glu Phe Ala Arg Ala Pro Thr Gly Asp Leu Arg Gly Val Val Met Ser
35 40 45

His Arg Thr Ile Met His Gln Met Cys Cys Met Ser Ala Ile Val Ser
50 55 60

Thr Ile Pro Thr Asp Ser Asn Asn Ser Gly Lys Pro Val Pro Arg Pro
65 70 75 80

His Gly Glu Ile Leu Met Ser Tyr Leu Asp Pro Arg Gln Gly Ile Gly
85 90 95

Met Ile Leu Gly Val Leu Leu Thr Val Tyr Ala Gly Asn Thr Thr Val
100 105 110

Trp Leu Glu Ser Leu Ala Val Glu Thr Pro Gly Leu Tyr Ala Ser Leu
115 120 125

Ile Thr Lys Tyr Arg Ala Ala Leu Leu Ala Ala Asp Tyr Pro Gly Leu
130 135 140

Lys Arg Ala Val Tyr Asn Tyr Gln Gln Asp Pro Met Ala Thr Arg Asn
145 150 155 160

Phe Lys Lys Asn Ser Glu Pro Asn Phe Ser Ser Leu Lys Leu Cys Leu
165 170 175

Ile Asp Thr Leu Thr Val Asp Cys Glu Phe His Glu Ile Leu Ala Asp
180 185 190

Arg Trp Leu Arg Pro Leu Arg Asn Pro Arg Ala Arg Glu Leu Val Thr
195 200 205

Pro Met Leu Cys Leu Pro Glu His Gly Gly Met Val Ile Ser Leu Arg
210 215 220

Asp Trp Leu Gly Gly Glu Glu Arg Met Gly Cys Pro Leu Lys His Glu
225 230 235 240

Val Leu Pro Pro Glu Lys Gln Lys Asp Lys Ser Glu Gly Glu Lys Lys
245 250 255

Glu Glu Glu Lys Gly Gly Glu Pro Lys Ala Thr Phe Gly Ser Ser Leu
260 265 270

Ile Gly Gly Ser Ala Ala Pro Ile Arg Lys Glu Gly Pro Arg Asn Asp
275 280 285

Leu Gly Glu Val Leu Leu Asp Lys Glu Ala Leu Lys Asn Asn Glu Ile
290 295 300

Val Ile Leu Ala Ile Gly Glu Glu Ala Arg Arg Leu Ala Asp Thr Thr
305 310 315 320

Pro Asn Ala Val Arg Val Gly Ala Phe Gly Tyr Pro Ile Pro Asp Ala
325 330 335

Thr Leu Ala Ile Val Asp Pro Glu Thr Gly Leu Leu Cys Thr Pro Asn
340 345 350

Val Val Gly Glu Ile Trp Val Asp Ser Pro Ser Leu Ser Gly Gly Phe
355 360 365

Trp Ala Leu Pro Lys Gln Thr Glu Ser Ile Phe His Ala Arg Pro Tyr
370 375 380

Arg Phe Gln Gly Gly Gly Pro Thr Pro Val Ile Val Glu Pro Glu Phe
385 390 395 400

Leu Arg Thr Gly Leu Leu Gly Cys Val Ile Glu Gly Gln Ile Phe Val
405 410 415

Leu Gly Leu Tyr Glu Asp Arg Leu Arg Gln Lys Val Glu Trp Val Glu
420 425 430

His Gly Val Glu Val Ala Glu His Arg Tyr Phe Phe Val Gln His Leu
435 440 445

Ile Leu Ser Ile Met Lys Asn Val Pro Lys Ile His Asp Cys Ser Ala
450 455 460

Phe Asp Val Phe Val Asn Glu Glu His Leu Pro Val Val Val Leu Glu
465 470 475 480

Ser Tyr Thr Ala Ser Thr Ala Pro Val Ala Ser Gly Gln Ser Pro Arg
485 490 495

Gln Leu Asp Val Pro Leu Leu Asp Ser Leu Ala Glu Lys Cys Met Gly
500 505 510

Val Leu Tyr Gln Glu His His Leu Arg Val Tyr Cys Val Met Ile Thr
515 520 525

Ala Pro Asn Thr Leu Pro Arg Val Leu Lys Asn Gly Arg Gln Glu Ile
530 535 540

Gly Asn Met Leu Cys Arg Lys Glu Phe Asp Asn Gly Ser Leu Pro Cys
545 550 555 560

Glu His Val Lys Phe Ser Val Glu Arg Ser Val Leu Ser Leu Pro Ile
565 570 575

Gly Val Asp Pro Val Gly Gly Ile Trp Ser Val Pro Ser Ser Ala Ala
580 585 590

Arg Gln Asp Ala Leu Ala Met Gln Glu Lys Gln Tyr Ser Gly Val Asp
595 600 605

Leu Arg Asp Val Ile Met Asp Asp Arg Thr Ser Thr Pro Leu Asn Asn
610 615 620

Phe Asn Ser Ile Val Asp Leu Leu Gln Trp Arg Val Ser Arg Gln Gly
625 630 635 640

Glu Glu Leu Cys Tyr Cys Ser Ile Asp Gly Arg Gly Arg Glu Gly Lys
645 650 655

Gly Ile Thr Trp Lys Lys Phe Asp Ser Lys Val Ala Ala Val Ala Ala
660 665 670

Tyr Leu Lys Asn Lys Val Lys Leu Arg Pro Gly Asp His Val Ile Leu
675 680 685

Met Tyr Thr His Ser Glu Glu Tyr Val Phe Ala Val His Ala Cys Phe
690 695 700

Cys Leu Gly Leu Val Ala Ile Pro Ile Ser Pro Val Asp Gln Asn Arg
705 710 715 720

Leu Ser Glu Asp Ala Pro Ala Leu Leu His Val Ile Val Asp Phe Arg
725 730 735

Val Lys Ala Ile Leu Val Asn Gly Glu Val Asn Asp Leu Leu Lys Gln
740 745 750

Lys Ile Val Ser Gln His Ile Lys Gln Ser Ala His Val Val Arg Thr
755 760 765

Ser Val Pro Ser Val Tyr Asn Thr Ser Lys Pro Pro Lys Gln Ser His
770 775 780

Gly Cys Arg His Leu Gly Phe Thr Met Asn Pro Gln Trp Leu Asn Ser
785 790 795 800

Lys Gln Pro Ala Val Ile Trp Thr Tyr Trp Thr Pro
805 810

<210> 48 
<211> 1836 
<212> DNA
<213> Cochliobolus heterostrophus 



<400> 48 



atgtctctct 


ccggcctgct 


QCQctccrccrcr 

ZD^ZD^ ^ zd zd zd 


qacrcrcaccccr 


ctgccaagcg 


tcacctcctc 


60 


lOtccaactgga 


atgccgccca 


gtttgaggag 


ctcaagtact 


cgtacggcct 


cactggtgtc 


120 


gaccaagtcg 


gcaacttctt 


qtQcrcrtccraC 

ZD -3 ZD _* ZD 


acctttctct 


acatgetcat 


tggcatctct 


180 


crcrcatqctcc 


tcatgctccg 


catctccaac 


a t crcr t c t crcr a 

ZD ZD ZD ZD 


acre a c acre cq 


gcacatcacc 


240 


gcaatgggaa 


gcccaaggca 


aaagtactgg 


gagaccaacc 


gaacaagctg 


q t crcrc c c t cr cr 

ZD ^ zD ZD ZD ZD 


300 


ctcaaccgcc 


acatcctcgt 


cgccccgctc 


tggaagaaga 


agcacaacgc 


ccagttccag 


360 


15atcagcagcg 


cgattgacaa 


cggaaccctc 


cctggaagat 


ggcacaccat 


catgctcctc 


420 


atctacgtcg 


gcctcaacgt 


tcrcatQcrtac 


cttgccctcc 


cctacgacgt 


cctcgaccac 


480 


aQcxqaqaccrc 


tcgccgccct 


tcatCfcracQC 


tctggaaccc 


tccrccQccct 

*-~ *— • zD ZD 


caacctcatc 


540 


cccaccatcc 


tcttcgccct 


ccgcaacaac 


cccc t cat ct 


cccttctcca 


ggt c tegtae 


600 


gacgacttca 


acctfcttcca 


ccgctgggct 


gcccgaatca 


ccattgccga 


ggccattgt c 


660 


2 Ocacactgccg 


cttggttgta 


caacaccaag 


cr c t crcr c crcr t cr 

ZD^ ^ ZD ZD ^ ZD ZD '"ZD 


gatggcaege 


cgt egtaget 


720 


gccctccaca 


ccgagggctc 


fctaccr crater cr 


crcr c a t crcr cr c cr 


gaactgtcgc 


cttcaccttc 


780 


atcggcatcc 


aggcctggtc 


cccatt ccgt 


cacgcctttt 


acgagacctt 


tctcaacatc 


840 


caccgcgtca 


tggtcattgc 


tgctctcctc 


ggcttgtaca 


agcacctgga 


gctgcacgct 


900 


ctgccccagg 


tcccatggat 


gtacct catc 


ttcat cttct 


gggcggctga 


gtggttcctc 


960 


2 5cgcctgtgct 


ccatctgcta 


ctacggcttc 


agectgaage 


aacgctcttc 


catcaccgtc 


102 0 


gaggccttgc 


ctggcgaagc 


tgt ccgt eta 


accatcaaca 


tggtccgega 


atggaccccc 


10 8 0 


cgtcccggat 


gtcacgtgca 


catgtggatg 


cctcgcctct 


ccctctggtc 


ctcgcatcca 


1140 


ttttccgtcg 


cctgggctgc 


gaccctgacc 


gacgactcca 


aagagatgac 


gcttcccact 


1200 


ctggaaggcg 


acgtcaccat 


gatcaatggc 


caacccagga 


aatcaaaaca 


aatcagtctc 


1260 


3 0atctgccgtg 


cccgtaccgg 


actcacccgt 


caaatgtatg 


aaaaggcaag 


caaaagcccc 


1320 


aacgagcaat 


tcaccacatg 


gggcttcatt 


gaaggeccat 


acggtggtca 


ccacagtctt 


1380 


gactcgtacg 


gtacttgtgt 


actgtttgcc 


gcaggtgtag 


gcatcaccca 


ccaggtcatg 


1440 


tacctcaagc 


atctagtcaa 


tggcttcaac 


aacggcacca 


ctgccacgca 


aaagattgtc 


15 0 0 


ctcatctgga 


cagtacccac 


gcccgactgc 


ctggagtggg 


tgcgcccatg gatggacgaa 


1560 


35gtcctccgca 


tgaagggtcg 


caagcagtgt 


ctccgcatca 


agctcttcat 


ctccagacca 


1620 


aagggccgtg 


tcgagagcag 


tagegacact 


gtcaagatgt 


acageggcag 


gcccaacatg 


1680 


aggagcttgt 


tggaggagga 


ggccaagcac 


cgcgttggtg 


ccatggccgt 


gaccgtgtgc 


1740 


gcgtctggcg 


gcatggccga 


cggtgtacga 


catgcagtgc 


gcccactgct 


taccgagggt 


1800 


tcggttgatt 


tcatagagga 


agectttacg 


tattga 






1836 



<210> 49
<211> 611
<212> PRT
<213> Cochliobolus heterostrophus

<400> 49

Met Ser Leu Ser Gly Leu Leu Arg Ser Arg Glu Ala Pro Ala Ala Lys
1 5 10 15

Arg His Leu Leu Ser Asn Trp Asn Ala Ala Gln Phe Glu Glu Leu Lys
20 25 30

Tyr Ser Tyr Gly Leu Thr Gly Val Asp Gln Val Gly Asn Phe Leu Trp
35 40 45

Val Asp Thr Phe Leu Tyr Met Leu Ile Gly Ile Ser Gly Met Leu Leu
50 55 60

Met Leu Arg Ile Ser Asn Met Val Trp Lys His Ser Arg His Ile Thr
65 70 75 80

Ala Met Gly Ser Pro Arg Gln Lys Tyr Trp Glu Thr Asn Arg Thr Ser
85 90 95

Trp Trp Pro Trp Leu Asn Arg His Ile Leu Val Ala Pro Leu Trp Lys
100 105 110

Lys Lys His Asn Ala Gln Phe Gln Ile Ser Ser Ala Ile Asp Asn Gly
115 120 125

Thr Leu Pro Gly Arg Trp His Thr Ile Met Leu Leu Ile Tyr Val Gly
130 135 140

Leu Asn Val Ala Trp Cys Leu Ala Leu Pro Tyr Asp Val Leu Asp His
145 150 155 160

Arg Glu Thr Leu Ala Ala Leu Arg Gly Arg Ser Gly Thr Leu Ala Ala
165 170 175

Leu Asn Leu Ile Pro Thr Ile Leu Phe Ala Leu Arg Asn Asn Pro Leu
180 185 190

Ile Ser Leu Leu Gln Val Ser Tyr Asp Asp Phe Asn Leu Phe His Arg
195 200 205

Trp Ala Ala Arg Ile Thr Ile Ala Glu Ala Ile Val His Thr Ala Ala
210 215 220

Trp Leu Tyr Asn Thr Lys Ala Gly Gly Gly Trp His Ala Val Val Ala
225 230 235 240

Ala Leu His Thr Glu Gly Ser Tyr Gly Trp Gly Met Gly Gly Thr Val
245 250 255

Ala Phe Thr Phe Ile Gly Ile Gln Ala Trp Ser Pro Phe Arg His Ala
260 265 270
Phe Tyr Glu Thr Phe Leu Asn Ile His Arg Val Met Val Ile Ala Ala
275 280 285

Leu Leu Gly Leu Tyr Lys His Leu Glu Leu His Ala Leu Pro Gln Val
290 295 300

Pro Trp Met Tyr Leu Ile Phe Ile Phe Trp Ala Ala Glu Trp Phe Leu
305 310 315 320

Arg Leu Cys Ser Ile Cys Tyr Tyr Gly Phe Ser Leu Lys Gln Arg Ser
325 330 335

Ser Ile Thr Val Glu Ala Leu Pro Gly Glu Ala Val Arg Leu Thr Ile
340 345 350

Asn Met Val Arg Glu Trp Thr Pro Arg Pro Gly Cys His Val His Met
355 360 365

Trp Met Pro Arg Leu Ser Leu Trp Ser Ser His Pro Phe Ser Val Ala
370 375 380

Trp Ala Ala Thr Leu Thr Asp Asp Ser Lys Glu Met Thr Leu Pro Thr
385 390 395 400

Leu Glu Gly Asp Val Thr Met Ile Asn Gly Gln Pro Arg Lys Ser Lys
405 410 415

Gln Ile Ser Leu Ile Cys Arg Ala Arg Thr Gly Leu Thr Arg Gln Met
420 425 430

Tyr Glu Lys Ala Ser Lys Ser Pro Asn Glu Gln Phe Thr Thr Trp Gly
435 440 445

Phe Ile Glu Gly Pro Tyr Gly Gly His His Ser Leu Asp Ser Tyr Gly
450 455 460

Thr Cys Val Leu Phe Ala Ala Gly Val Gly Ile Thr His Gln Val Met
465 470 475 480

Tyr Leu Lys His Leu Val Asn Gly Phe Asn Asn Gly Thr Thr Ala Thr
485 490 495

Gln Lys Ile Val Leu Ile Trp Thr Val Pro Thr Pro Asp Cys Leu Glu
500 505 510

Trp Val Arg Pro Trp Met Asp Glu Val Leu Arg Met Lys Gly Arg Lys
515 520 525

Gln Cys Leu Arg Ile Lys Leu Phe Ile Ser Arg Pro Lys Gly Arg Val
530 535 540

Glu Ser Ser Ser Asp Thr Val Lys Met Tyr Ser Gly Arg Pro Asn Met
545 550 555 560

Arg Ser Leu Leu Glu Glu Glu Ala Lys His Arg Val Gly Ala Met Ala
565 570 575

Val Thr Val Cys Ala Ser Gly Gly Met Ala Asp Gly Val Arg His Ala
580 585 590
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Val Arg Pro Leu Leu Thr Glu Gly Ser Val Asp Phe lie Glu Glu Ala 

595 600 605 

Phe Thr Tyr 
610 

<210> 50
<211> 6553
<212> DNA
<213> Cochliobolus heterostrophus

<400> 50



4*" ft ft ft 4~ ft ft ft ft ft 

tgcctycgcc 


4~ /^r 4r» r-t 4"» 4— rf^r 4— ft 

L- g u gc X- u g u g 


ft ft 4~* fr 4~" frfr *^ -*i 4^~ 

cccg tggaau 


ft 4~~ ft ft ft ft ft ft ft ft 

gccgcggccc 


gciz.gcx-.gcac 


agee uacccg 


b U 


4— «-\ 4— !3 ^1 /"I 




/'t « ft ft 4— 4— <*-t -a ft 

CCCyCULCaC 


ft 4-" ft ft ft 4— 4— ft f ft 

cuycctLgcc 


4*" r~\ f~\ 4— /"i 4™* /^i rf^r 

uccctcctcg 


4— ft ft ft ft ft \— 

LgCCaCaCaL 


ion 


/■t r»(^r/^ r"i ft ft ft ft ft 

ccgccyccca 


/-i »-\ /-I ft /-I 4— ft 

CaaCaCCdLy 


ft ft 4~ * /~Y /^r /«t f \ 

gcugcgacca 


/— t ft ft T\ ft ft 4"~ 

accccgagcc 


geaggecaaa 


ft 4~~ /**r /*i 0 ft ft /^r /~t 

c egcaggage 


i q n 
ioU 


"1 C 4— ft ft i^l /~T "?l « ^ 

xo eg g accacga 


ft ft 4~ ft ft ^ ft ft /~T 

geccgaggag 


ggegatatta 


cacaaaaagg 


gt ccgfcacfcg 


ctgcaccacc 


O A C\ 


accg ccaccc 


gec cccccgc 


gfcgcgcfcaafc 


cagfccgcafca 


gc tatgaaaa 


aegt cgcacc 


"2 n n 


gtgc tgctgt 


cgcag ua.ee u 


agggectgae 


tttgctgccc 


agttgcaggc 


cgacctgaac 


«i 0 U 


cagcagaacc 


caccccaacc 


atccagtgag 


ggct etcgefc 


cccgcaccgc 


atectttgefc 


/ion 


afctccgt ccg 


gt ccgagt cc 


ateaeggega 


ccacaacccc 


cacatatcca 


gctcccccgc 


/ion 
4 O 0 


2 Occcgact cat 


accatgacgc 


ttccgcacag 


ggccaat tgg 


gcgcacccat 


gecatatgeg 


c /1 n 


aacgcctccg 


ccgctgcct c 


ggggggctcg 


cagtacatgg 


catacccgcc 


cagecaagt c 


/T r\ r\ 
D 0 0 


ggccgttttc aagagaagca gctgggcctg cgtacaaatt cgctccagcg caattcctca 660
cagctgtcgc aaggaagcga gacgttcatt ccacggcctc aaacgcctga atacaaccac 720




tcgcgcgagc ccaccatgat gggcaactac gccttcaatc cagacaatca gcaaagttat 780
gatggccaat ttggctctcc gggagaggcc agtcgaagga gcaccatgct cgaggtaaac 840
cagggttatt tttccgactt cacaggccag cagatgcaag acaatcgcga ctcgtatggg 900
ggacccaacc gctactcgtc gggagatgcc ttttctccta ccgccgcgat tccacctccc 960
atgatgaacc ccaacgatct ccccttgggc gctgctgaaa ccatgatgcc gctagagccc 1020
cgcgatctgc cttttgacgt ttacgaccct cacaacccca atgtcaaaat gtcaaagttt 1080
gacaacattg gcgctgtctt gcgtcaccga agtcgcacac agccaaggac gactgccttc 1140
tgggtccttg acgcaaaagg caaagagacg gcgtccatca cctgggaaaa ggtggctagt 1200
cgcgcggaaa aggtggccaa agtgattcgg gacaagagca acctctatcg aggcgaccgt 1260
gtggcattag tgtacaggga tacagaaatc attgattttg tcgtggcgtt gatgggctgc 1320
ttcattgcgg gcgttgtagc ggtacccatc aatagcgtcg acgactacca gaaactcatt 1380
cttctcctaa cgacaactca agctcatctc gcattgacca cagacaacaa tctcaaggcc 1440
tttcatcgtg acattagtca gaaccgtctg aaatggccga gtggggtaga gtggtggaag 1500
acgaacgagt ttggcagcca ccaccccaag aaacatgacg atactccagc tttgcaagta 1560
ccagaggttg cctatattga gttctcgcgt gcacctactg gtgaccttcg cggtgtggtg 1620
cttagtcacc ggactattat gcaccaaatg gcctgcatca gtgccatgat tagcacgata 1680
cccaccaacg ctcagagcca agacacgttc agcactagcc tacgggatgc agagggaaag 1740
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t" +~ ftft^r +~ ft ft 4- ft 


Cagcaccgtc 


cdyaaacccc 


acagaagtga 


ucc ucacgua 


ccucgacccg 


i q c\ n 


ft /T ft ft St Sl «— i *— ▼ ft 
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LCucagcguc 


t tgtt tgcag 


tttatggagg 


ccacaccacc 


JLo 0 L) 
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agacagegae 


catggaaacc 


ccgggt ctat 
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catcaccaaa 


1 Q O A 

xy z u 


t acaagt CCa 


aca tact get 


ageggattae 


ccaggcctca 


agcgcgctgc 


atacaactac 


"l O O A 

j. y 0 L) 


f* aa ana fx ft ai*r< 
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CadtyyCUaC 


SI SI ft Si Si SI ft 4— 4- ft 
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0 n a n 
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eta 4- inna 4- si 4- 4- 
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ft ft ft si si si ft ft ft 4~ 
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" 1 si /—1 4— fr st 4— /— 1 /— f /-^ 

aact.gai.cgc 


j— r y— 1 /—I — \ — % 4— ft ft 4— 
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X tD U 


4— /— f /-I 4— 4- /— r r~1 r-t /— r 
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4~ »—i 4— r-»4"" si ft ft ft 

tctgLacgcg 


Si 4— ft ft ft 4— Si ft ft 

dCL.gycL.dyy 


uggagaggag 


noon 


f*i ft ft si 4~ nrrrrri 4~ 

cgcaugggci: 


ft ft ft ftfTfl 4— SI SI /T 

gcccgci-adg 


ft SI +~ SI ft ft SI Si 

CdLaycayLd 


ft "i Sl /~f Sl /~*f 4~ /™1 «3 /^r 

gaagagi.cag 


si 4— si si 4"— it — \ 4— ft *a 

a tdd L.gaT_.y a 


ay augauaca 


O O Q A 
_^t3U 


"1 A f~Y si ftft si 4™ a si /~t4~ 

j. u y dy y d tddy l 


si 4~~ ft ft si ft ft ft ft ft 

atgeagegge 


si si si 4— ft ft ft 4— si /~i 

aadtygctac 


^* ^ 1^1 si ft 4~~ /^i 4 — 4™" si 

LCC ay -CCUa 


euggrggegg 


ft si /— t +- st ft a 3 a /^-f 

edeuaeddag 


« jrtU 


—> ^ si st st si si ftft 

aaCa.ad.aagg 


agaagaagaa 


gaaaggeccg 


acagagctta 


cagaaatct t 


gc tggacaag 


O /l A A 

z ^ u u 


yaay C L.C uy a 


agatgaacga 


agt cat tgtt 


ctggccat tg 


gagaagaagc 


aagcaagegg 


A 4fc D U 


gcaaacgagc 


ccggcaccat 


gegagteggt 


geett tggat 


accccatacc 


ggatgegaca 


1 C O A 
Z D Z U 


/-i 4— -3 i^-r ft 4— si 4- 4- 


4— si ft si /-1 /~< 4— /-» i» 

tayaccciya 


/t a r< a a <^*f 4— <--« 4— 4— 
yaCaaytCUt. 


ft 4- a f- ft 4~* 4— st 

CUatgLUCaC 


1*1 a 4* a nf" «rta 4— 

CdL.dCL.CydL. 


si ft ft ft ft si ft si 4- /-1 

dy yegdy due 


_ joU 


1 C4- ft ft ft i~ 21 <«-rsi 4— 4~ 

loLgggtagatt 


1 ft ft ft 4— 4~ /-1 si ft 4~ 
CyCCt UCaCt 


ft +" ft 4— ft ft 4- (*^r /~f 

cucugguggc 


4™ 4— ^1 4— ft ft ft si /~r /~t 

ttcuggcagc 


4— /— m •— i st si /— i" /—1 *— i 

tgeagaagea 


tacagagacc 


n <r A A 
Z O 4fc U 


si4-4-4-+-/~ir"'aa+~ /— f 


4~ /~1 /~T SI /~< ft SI 4™ SS 

CtCyaCCota 


1 /-1 (T 4 — H — 4 — r~tft\- 4- 

ccg l. Ltcy LL. 


/-fa ft ft ft 4~ si f~c /1 

y a yy_j' : ' a 9 cc 


ft 4— si ft ft ft ft si /■*• *— > 
CLdCyCCdCd 


<~t+- +- .— r /— 1 4- 4~ rta si 

y u ty cu ty dd 


z / u u 


s—i 4— /— *r — > f^r 4— 4— 4~* ft 

CCCgagtUtC 


4— j—t /-* f—t 4-* 4~" feft 

u gc g u a c u gg 


actcc t egge 


4-4—4- ft\- 4- #-r4- si /-r 

u l ug l. eg tiiag 


agggaaaaat 


_4_4_4_--4___,4_4_ 

auuuguccuu 


Z / b U 


ggacrgtacg 


aagat cgcat 


cagacagegt 


gttgaatggg 


tagaaaatgg 


t cagcttgaa 


z oz U 


gccgagcatc 


/tsj4-sk^i4-4-4-4-4- 

gatactttLt 


tgtgcagcac 


ctigg tcacaa 


gcattatgaa 


ggccgtgcca 


O Q Q A 
Z O 0 U 


2 Oaaaatttacg 


ac tggtaagt 


gage tgccaa 


cagagcaagg 


ac tgt ct aac 


gtgt cat age 


O Q >1 A 
Z -74 U 


tcgucgtLug 


— 4-4-^4-4- — 4- ^4- 

a u T- c l. u a u g c 


aaatggtgaa 


tacc tgccaa 


t cattct cat 


egagaegcag 


Q A A A 

jUUU 


gccgcaucga 


ctgcgcccac 


aaacccaggt 


ggaccaccac 


aacaat tgga 


tataccattt 


1 A £C A 


4— 4— nTfa 4— 4— t -3 /-1 

uuyyaL.L-t_.cto 


f~ si 4™ ft 4- rTsj rra 
LaLL uy ay ay 


(^f 4— ft ft Si 4~ ft ft SI /T 

gugcauggag 


fti- rip-)- 1" +- 0 rT< 

yttL LLL-dUL. 


si si ft si ft ft si 4— /"i si 

aagagca _ca 


+~ 4~ 4~ si ft ft ft ft^— si 

ttuacgggta 


01 on 

Jl-U 




+— /""f *Z\ 4 — SI ft -3 fr ft 

Uga-LaCayC 


si*^i^-i4 — si si+'ssz-isi 
dCCUddtdCd 


ft 4— 4- <*™ /-« si ft ft si /^r 

cuLccacgay 


4— S3 4~ ft «3 Sl ft Sl SS 

L-CdL-Cddy dd 


ft ft ft Sl ft ft ft ft ft Sl 

eggaeggega 


J J.OU 


O C j^vr >-i si si 4— 4— t~tft ft si 

z ogaaatcggca 


s» 4— si +" ft ft 4— /-^r 4~ y—r 

atatgctgtg 


4~ Sl ft ft Sl Sl SI /T 

taggagagag 


+-4—4-" r— r si /~i — » si 4— #— r 

Lutgacaatg 


/-f ft 4— /—I 4— /-i 4— /— r /—1 ^— 1 

gcucucugcc 


ft 4— i*— r 4— /— < 4— — 1 ^— « — 1 y— t 

cugugudcac 


1 O A A 
j-i - U 


/^f 4— *-i — ^ /— _ 4— 4— 4— /*-r 

gcaaagcEEg 


geattgageg 


at cagtgcag 


aacattgege 


t eggtgaega 


ucccgcuggc 


•3 *a a a 
_5 3 V U 


ggcatgtggt 


catttgaggc 


atcaatggca 


cgtcagcaat 


tettgatget 


ccaagacaag 


*3 O £Z A 
J jOU 


caatactctg 


gtgtcgatca 


tegegaagtc 


gt cattgacg 


acaggacatc 


gactccactc 


J 4 z U 


aatcagtt ct 


cgaatatcca 


cgacc tgatg 


caatggcgtg 


t at ct eggea 


ggccgaggaa 


*3 /I Q A 


T H f- ft ft 4— 4- si ft 4- 
-5 UCCtgCtLaCL 


gcactgt cga 


eggtcgagga 


aaagagggca 


aaggcgt caa 


ttggaagaag 


O C /I A 




aggttgcggg 


cgtagcaatg 


tacct caaga 


acaaggt caa 


ggtccaggcc 


"3 £ A A 

_bl)U 


ft ft ft ft si 4*na1" n 

ggegaucate 


+- f~t 4— 4- ft 4~ /~r s» 4~ 
LCCL-LCUyat 


t~t 4~ si /~i si f* ft ft si 4~ 

gcacacgcaL 


T~ yi q y^t* ^ ^ /^f <-~i +- 

LCagaayaaL 


4- 4- / -> r 4- 4_ 4_ — x. ^ — , 

UL.gtLt.at.gc 


uguuedugea 


JDDU 


uguuLcgcgc 


+— 4"" ft ft ft ft 4^ f* 4 — 

T_.L-ggagcL.gt. 


Lugca uacca 


a uggcgccaa 


ttgat cagaa 


ccgguugaau 


o / z u 


gaggatgcgc cggccttgct gcatatcctt gcagatttca aggtcaaagc cattcttgtc 3780
aacgctgacg ttgaccatct gatgaagatc aagcaagtat cgcagcacat caaacaatcg 3840
gccgctatcc tcaagatcag tgtgccaaac acatacagca caacaaagcc gccaaagcaa 3900
tccagtggct gccgcgacct caagcttaca attcgaccgg catggattca ggcgggtttc 3960
ccagtgctag tctggacata ctggacgccc gatcaacgtc gtatcgcagt tcagctgggc 4020
catagccaaa tcatggcact gtgcaaggtc caaaaagaaa catgccaaat gacaagtaca 4080
cgaccagtcc ttggttgtgt ccggagcacg ataggacttg gtttccttca cacttgtctc 4140
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O 4~ /ar/ar/TO O 4~ <al 4- 

a e g g y a a e c e 


4— fi +- 4— n /a ft ft ft 

ucctuyccyc 


anuria /"» o +~ 

acLcacauac 


c egg ugecac 


cegeegacee 


egcacaaaac 


a n n A 


/ai /"I 4— *a +- 4" 4** /ai 

cctaatatuC 


tgttccaaac 


getttcgegg 


tacaagatca 


aggatgeata 


tgcaacgagt 




caaatgttgg 


accacgccat 


cgcacgcgga 


gctggtaaga 


gtatggctct 


gcacgagctg 


/I 1 D A 

<t o Z U 


aagaatctca. 


tgattgegae 


tgatggaaga 


ccacgcgttg 


aegeeegeaa 


gtgaacattt 


/l "3 Q A 


Dgcacgagagg 


acLuccauya 


4- 4*> rf^r /-I +■ ^ q /"i 4^ 

utgeuaaetc 


aacgcagacc 


aaagagegcg 


egegcaceee 


A A a n 


/T/~i/ar /ai f< O /ar *ai /ai a 

gcgccagcca 


-a /a 4— 4— fa /to ft /*■* /a 

acttayaccc 


aaccycaauc 


a s f* ?a 4-» <— r +- f~\ 4~ 

aacacLytct 


-a 4~" /ai >a n a 4— /aT+- 

acecacatye 


a }■ 4- /ar:a a f* /"» /~« ra 
a L Ly dclLLLd 




a 4— ;a /ar/at a +~ 

atgytagcau 


ft *a /n/^r«5 4— /a o 4— -a 

cacyatcata 


/"» a 4— j»-r4- r /~ r4~ al-f 1 

caty uyuat l 


gage cage eg 


■a <^r<at4~ /at ria f* r~* 4— 

agceccaece 


/-t(a*a 4~ /a"4— /ar /a -a 4— 

cgaegegcae 




/ar /a 4— /a 4— /ar *at /ar *a /a 

yCLCLycgac 


/ar /a rr/^f /a /a 4— /™i /-f 4~" 

gcygccLCyL 


/-i ra 4/^ rf i^t 4~" 4~^ 

ca tycccyL u 


gacccegaca 


^— i -a /ar *a r** f~% /™i /ai «a *a 

cagagcccaa 


/ai/ar /a 4—4-4- /ar/a 4— /a 

cgceeegcec 


a n 


/ar4— r-t /a o o /ar m /a 4~~ 

guccadgacu 




y ccay eg age 


jr^ s-*e f-\ n 3 n 4- *a +" 

aegcaaaeae 


ccaeegecaa 


/a /a /a a <ar a /ar a /a r~i 

cccagagacc 


/I fTOA 
^OOU 


xuaaccaacege 


gct tgaaegg 


egagtaegge 


gagatctggg 


egcageccga 


ggegaaegee 


A "7 A n 


tat age t t ct 


acatgt cgaa 


agagege ttg 


gatgeagaac 


get t caatgg 


gaggacgaee 


a q n n 


gaeggagace 


caaatgtgcg 


atatgttcgt 


acaggegatt 


4— — , r-*, r~r ~ 4- 4- 4- 4— 4- 

eaggaeeeee 


geacagegtg 


4fc o b U 


acacggccca 


ttggacccaa 


cggtgcacct 


gttgatatgc 


aggtgcttt t 


cgtgcttgga 


■^t y z u 


age at aggtg 


4-* 4-" 4~ 4™» st 

acacttttya 


agtcaacgga 


cegaaccaee 


ececeaegga 


caeey agcag 


A Q R n 


1 C+" «— i 4— /~r4— 4— <ar -a ;a /a 

lotcuy ttgaac 


j 1 ■ 1 ■ 4— /art- >ai o ft /"i/t 

gtuy Lcaccy 


f~< ra ci 4— o 4- 4** nrf - 

gaaLattyuc 


ccey yaggce 


r**/ar+~ ;a /~«/ar4** 4— 4— /~« 

ggeacgeeec 


4~ 4— ft /ar a 4~ 4- ra /ar fa 
L. LLy a L LLyL 




4- ^4- 4- a f f.J- art 


uaaatacLta 


/—14 - "™»ra/™«!a/*i4—/~i4— 

cuaacacuct 


acagegcege 


4" 4~ 4~ /** f-t m <ar/ar/~i -a 

e e e c c agg c a 


/ar/ar4~ /ar^ar/ar /-•• 4— 4— /ar 


ci a n 


4— 4™ r*>r 4~* 4— /*t 4™ ^ 4— 

LtgLLytCy u 


uguggaaa tc 


uuccgacgca 


aceeccecgc 


a a /ar /ai a 4~" /ar/ar 4— <a 

aagcaeggeg 


ccegegaeeg 


jlou 


Lcaacycaat 


13. l. ugaacgag 


cac cage egg 


ecaeegacae 


4— /ar4~ /ai 4— /~t/ar4~ 4~ 4~ 

egececgeee 


y egcaaaagg 




4— 4™ <p-# ~\ 

ycgacc ccca 


ccggccccgc 


eugggegaga 


ageaacgegg 


aaagaeecee 


gcaggaeggg 




O A 4** /ai ^1 /a ^ /a /ar/ar a *™1 

<i uccacacyyaa 


gatyegcaca 


auageccagt. 


acageaeacg 


/at /at *a 4™ /a 4~ *a fa +^ 

ggaecceaae 


/~t q /~^i^r ^ 4" 4^ 

y y ci c cly y t. u 


3 j U 


cccagatgat 


gat caeggaa 


gagcctggtc 


caegggctag 


catgactgga 


ageaegceeg 


D^i u u 


ggcgaatggg 


cggcccagcc 


agt at caagg 


ccgggt cgac 


■a ra rr *a /ar /a -a /ai /a /ar 

aagagcaccg 


-a »ar 4— /a 4- a o 4a r-r/ar 

ageceaaegg 


CASH 
3 *± O U 


/ar/-t 4— /ar«a ^ai >a ft ft 

ycatyacayc 


t~t ?a 4/" ^ 4— /~r a a +" 

yactacyaaL 


■a ^ 4™ 4~ a 4™ /~i /*t 

aacctauccc 


4— 4~ a /~i an>a /ar/~i *a 

eeacacagca 


/ar/~» ra ^3 /a a /ar /a *a *ar 

gcaacagcag 


/a aa4— a a a o 

caaeaccaac 


c; c: r> n 

D 3 u 


a ft ft f*ftftfti~ a +~ 

dyccgyytaL 


y taLycucaa 


caycaayyca 


4— /ar<-i a nr>r«r<(*'a 
Ly OaLLL-uLd. 


far /a «a o /at a a r<a r< 


L ctel L. L. L- dy L cl 


c c o n 
O 3 O U 


O Cf" /ar4-- /a /a «a fa <ai -a /a 

a Dtytccaacac 


gccaccacaa 


gg cccacccc 


a ra /ar<r /~i /ar 4~ ci<aro 

aaggegeaga 


-a <ai t" a na t" 4~ 

aceacaegae 


/a /a 4-' a /ar /a /ar si /a /a 

cceagcyacc 


c: (a A n 


gcacaccaac agacaaccgg cactctttcc ttgccgaccc gcgtatgcag aaccagggcc 5700
aaatgaacga gacgggcgcc tacgaaccca tgaactatca aaacgcgtat catccgcatc 5760
aacaacaata cgaatctgaa gacgggggga gcagactcag cggccccgtg ccagacgtgc 5820
tgcggccggg tccttcatcc gggtccatag agcagcacga ccaagctaac aacgacaaca 5880
atatgtggaa taatcgcgag tactatggta acagcccatc gtatgcaggc ggatacacgc 5940


aagatggcaa 


tatccacgag 


cagcaacaac 


acgatgagta 


cacgagtaat 


gcgt catatg 


£T A A A 


gcggaaa cca 


aggagcaggc 


y y aggc ageg 


gcggcggtgg 


cggececcga 


/ar4— 4- #— r/a -a ra -a 4- /a 

geegcaaaec 


^ n £ n 


gcgacagcuc 


egacagegag 


y'y'tgcagatg 


*r\ /~x «T5 /^i i^ar 4 1 " 4^ 

aegaegceeg 


gagaegtgat 


✓ar /a /a /a 4— 4— /ar/~i 4— /a 

gccccegcec 


O ±Z V 


ayaucaauL u 


tgegggegge 


/^c s~i 4-" 4~ /^r /^*t 4~* /^f 

gcrgcugcug 


cceccgcegg 


agcacctgct 


/ar /a 4— /ar/ar 4— /ar /a 4— 4— 

gceggegcee 


OloU 


cttcttcgca gccgggccat gcgcagtaga cgggatatgc gtgagttttt ttttaaattt 6240


cgtacataga gaccgttgta tacgcaggtt tcaaattaga agagcgaata tgcatatcag 6300
ctgttgttca atgttctagt ttgggaaggt taaccccccc cccttcccct tccaagactt 6360
ttcacttgtt tgtgtgtgat ttaaatctgg agatttcaaa tctacatctc gctatacata 6420
ggtgttgttt gataacgtag ggggcagaag ggtatctcgt gatattagac tgggagttgc 6480
atgaatcaag gtgttgagca aaaaaagaga gagcggtgaa gggcgggggg gataggtggt 6540
gtgcacgtgg ctg 6553



<210> 51
<211> 530
<212> PRT
<213> Alternaria solani

<400> 51

Lys Lys Lys Gly Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu
1 5 10 15

Ala Leu Lys Leu Asn Glu Val Val Val Leu Ala Ile Gly Glu Glu Val
20 25 30

Ser Lys Arg Val Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly
35 40 45

Tyr Pro Ile Pro Asp Ala Thr Leu Ala Val Val Asp Pro Glu Thr Asn
50 55 60

Leu Leu Cys Ser Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro
65 70 75 80

Ser Leu Ser Gly Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile
85 90 95

Phe His Ala Arg Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln
100 105 110

Leu Leu Glu Leu Glu Phe Leu Arg Thr Gly Leu Leu Gly Cys Val Val
115 120 125

Glu Gly Lys Ile Phe Val Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln
130 135 140

Arg Val Glu Trp Val Glu His Gly Gln Leu Glu Ala Glu His Arg Tyr
145 150 155 160

Phe Phe Val Gln His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys
165 170 175

Ile Tyr Asp Cys Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu
180 185 190

Pro Ile Ile Leu Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn
195 200 205

Pro Gly Gly Pro Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu
210 215 220

Ser Glu Arg Cys Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val
225 230 235 240
Tyr Cys Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Ile Lys
245 250 255

Asn Gly Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp
260 265 270

Asn Gly Ser Leu Pro Cys Val His Val Lys Phe Gly Val Glu Arg Ser
275 280 285

Val Gln Asn Ile Ala Leu Gly Asp Asp Pro Ala Gly Gly Met Trp Ser
290 295 300

Tyr Glu Ala Ser Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys
305 310 315 320

Gln Tyr Ser Gly Val Asp His Arg Glu Val Val Ile Asp Asp Arg Thr
325 330 335

Ser Thr Pro Leu Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp
340 345 350

Arg Val Gln Arg Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly
355 360 365

Arg Gly Lys Glu Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys
370 375 380

Val Ala Gly Val Ala Met Tyr Leu Lys Asn Lys Val Lys Gly Gln Thr
385 390 395 400

Gly Asp His Leu Leu Leu Met Tyr Thr His Ser Glu Asp Phe Val Tyr
405 410 415

Ala Val His Ala Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala
420 425 430

Pro Ile Asp Gln Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His
435 440 445

Ile Ile Ala Asp Phe Lys Val Lys Ala Ile Leu Val Asn Ala Gly Val
450 455 460

Asp His Leu Met Lys Val Lys Gln Val Ser Gln His Ile Lys Gln Ser
465 470 475 480

Ala Val Ile Leu Lys Ile Asn Val Pro Asn Thr Tyr Asn Thr Thr Lys
485 490 495

Pro Pro Lys Gln Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg
500 505 510

Pro Ala Trp Ile Gln Ser Gly Phe Pro Val Leu Val Trp Thr Tyr Trp
515 520 525

Thr Pro
530



<210> 52
<211> 530
<212> PRT
<213> Pyrenophora teres

<400> 52

Lys Lys Lys Gly Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu
1 5 10 15

Ala Leu Lys Met Asn Asp Val Val Val Leu Ala Ile Gly Glu Glu Ala
20 25 30

Ser Lys Arg Ala Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly
35 40 45

Tyr Pro Ile Pro Asp Ala Thr Leu Ala Val Val Asp Pro Glu Thr Asn
50 55 60

Leu Leu Cys Ser Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro
65 70 75 80

Ser Leu Ser Gly Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile
85 90 95

Phe His Ala Arg Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln
100 105 110

Leu Leu Glu Leu Glu Phe Leu Arg Thr Gly Leu Leu Gly Phe Val Val
115 120 125

Glu Gly Lys Val Phe Ile Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln
130 135 140

Arg Val Glu Trp Val Glu His Gly Gln Leu Glu Ala Glu His Arg Tyr
145 150 155 160

Phe Phe Val Gln His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys
165 170 175

Ile Tyr Asp Cys Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu
180 185 190

Pro Ile Ile Leu Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn
195 200 205

Pro Gly Gly Pro Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu
210 215 220

Ser Glu Arg Cys Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val
225 230 235 240

Tyr Cys Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Val Lys
245 250 255

Asn Gly Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp
260 265 270

Asn Gly Ser Leu Pro Cys Val His Val Lys Phe Gly Val Glu Arg Ser
275 280 285

Val Leu Asn Ile Ala Leu Gly Asp Asp Pro Ser Gly Gly Met Trp Ser
290 295 300

Tyr Glu Ala Ser Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys
305 310 315 320

Gln Tyr Ser Gly Val Asp His Arg Glu Val Val Met Asp Asp Arg Thr
325 330 335

Ser Thr Pro Leu Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp
340 345 350

Arg Val Ser Arg Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly
355 360 365

Arg Gly Lys Glu Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys
370 375 380

Val Ala Gly Val Ala Met Tyr Leu Lys Asn Lys Val Lys Val Gln Thr
385 390 395 400

Gly Asp His Leu Leu Leu Met Tyr Thr His Ser Glu Asp Phe Val Tyr
405 410 415

Ala Val His Ala Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala
420 425 430

Pro Ile Asp Gln Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His
435 440 445

Ile Leu Ala Asp Phe Lys Val Lys Ala Ile Leu Val Asn Ala Asp Val
450 455 460

Asp His Leu Met Lys Val Lys Gln Val Ser Gln His Ile Lys Gln Ser
465 470 475 480

Ala Ala Ile Phe Lys Ile Asn Val Pro His Thr Tyr Asn Thr Thr Lys
485 490 495

Pro Pro Lys Gln Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg
500 505 510

Pro Ala Trp Val Gln Pro Gly Phe Pro Val Leu Val Trp Thr Tyr Trp
515 520 525

Thr Pro
530



<210> 53
<211> 531
<212> PRT
<213> Fusarium graminearum

<400> 53
Glu Glu Arg Ala Lys Asn Glu Leu Gly Glu Val Leu Leu Asp Arg Glu
1 5 10 15

Ala Leu Lys Thr Asn Glu Val Val Val Val Ala Ile Gly Asn Asp Ala
20 25 30

Arg Lys Arg Val Thr Asp Asp Pro Gly Leu Val Arg Val Gly Ser Phe
35 40 45

Gly Tyr Pro Ile Pro Asp Ala Thr Leu Ser Val Val Asp Pro Glu Thr
50 55 60

Gly Leu Leu Ala Ser Pro His Ser Val Gly Glu Ile Trp Val Asp Ser
65 70 75 80

Pro Ser Leu Ser Gly Gly Phe Trp Ala Gln Pro Lys Asn Thr Glu Leu
85 90 95

Ile Phe His Ala Arg Pro Tyr Lys Phe Asp Pro Gly Asp Pro Thr Pro
100 105 110

Gln Pro Val Glu Pro Glu Phe Leu Arg Thr Gly Leu Leu Gly Thr Val
115 120 125

Ile Glu Gly Lys Ile Phe Val Leu Gly Leu Tyr Glu Asp Arg Ile Arg
130 135 140

Gln Lys Val Glu Trp Val Glu His Gly His Glu Leu Ala Glu Tyr Arg
145 150 155 160

Tyr Phe Phe Val Gln His Ile Val Val Ser Ile Val Lys Asn Val Pro
165 170 175

Lys Ile Tyr Asp Cys Ser Ala Phe Asp Val Phe Val Asn Asp Glu His
180 185 190

Leu Pro Val Val Val Leu Glu Ser Ala Ala Ala Ser Thr Ala Pro Leu
195 200 205

Thr Ser Gly Gly Pro Pro Arg Gln Pro Asp Thr Ala Leu Leu Glu Ser
210 215 220

Leu Ala Glu Arg Cys Met Glu Val Leu Met Ser Glu His His Leu Arg
225 230 235 240

Leu Tyr Cys Val Met Ile Thr Ala Pro Asp Thr Leu Pro Arg Val Val
245 250 255

Lys Asn Gly Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe
260 265 270

Asp Leu Gly Asn Leu Pro Cys Val His Val Lys Phe Gly Val Glu His
275 280 285

Ala Val Leu Asn Leu Pro Ile Gly Val Asp Pro Ile Gly Gly Ile Trp
290 295 300

Ser Pro Leu Ala Ser Asp Ser Arg Ala Glu Phe Leu Leu Pro Ala Asp
305 310 315 320
Lys Gln Tyr Ser Gly Val Asp Arg Arg Glu Val Val Ile Asp Asp Arg
325 330 335

Thr Ser Thr Pro Leu Asn Asn Phe Ser Cys Ile Ser Asp Leu Ile Gln
340 345 350

Trp Arg Val Ala Arg Gln Pro Glu Glu Leu Ala Tyr Cys Thr Ile Asp
355 360 365

Gly Lys Ser Arg Glu Gly Lys Gly Val Thr Trp Lys Lys Phe Asp Thr
370 375 380

Lys Val Ala Ser Val Ala Met Tyr Leu Lys Asn Lys Val Lys Val Arg
385 390 395 400

Pro Gly Asp His Ile Ile Leu Met Tyr Thr His Ser Glu Glu Phe Val
405 410 415

Phe Ala Ile His Ala Cys Ile Ser Leu Gly Ala Ile Val Ile Pro Ile
420 425 430

Ala Pro Leu Asp Gln Asn Arg Leu Asn Glu Asp Val Pro Ala Phe Leu
435 440 445

His Ile Val Ser Asp Tyr Asn Val Lys Ala Val Leu Val Asn Ala Glu
450 455 460

Val Asp His Leu Ile Lys Val Lys Pro Val Ala Ser His Ile Lys Gln
465 470 475 480

Ser Ala Gln Val Leu Lys Ile Thr Ser Pro Ala Ile Tyr Asn Thr Thr
485 490 495

Lys Pro Pro Lys Gln Ser Ser Gly Leu Arg Asp Leu Arg Phe Thr Ile
500 505 510

Asp Pro Ala Trp Ile Arg Pro Gly Tyr Pro Val Ile Val Trp Thr Tyr
515 520 525

Trp Thr Pro
530

<210> 54
<211> 531
<212> PRT
<213> Coccidioides immitis

<400> 54

Lys Glu Gly Pro Arg Asn Asp Leu Gly Glu Val Leu Leu Asp Lys Glu
1 5 10 15

Ala Leu Lys Asn Asn Glu Ile Val Ile Leu Ala Ile Gly Glu Glu Ala
20 25 30

Arg Arg Leu Ala Asp Thr Thr Pro Asn Ala Val Arg Val Gly Ala Phe
35 40 45

Gly Tyr Pro Ile Pro Asp Ala Thr Leu Ala Ile Val Asp Pro Glu Thr
50 55 60

Gly Leu Leu Cys Thr Pro Asn Val Val Gly Glu Ile Trp Val Asp Ser
65 70 75 80

Pro Ser Leu Ser Gly Gly Phe Trp Ala Leu Pro Lys Gln Thr Glu Ser
85 90 95

Ile Phe His Ala Arg Pro Tyr Arg Phe Gln Gly Gly Gly Pro Thr Pro
100 105 110

Val Ile Val Glu Pro Glu Phe Leu Arg Thr Gly Leu Leu Gly Cys Val
115 120 125

Ile Glu Gly Gln Ile Phe Val Leu Gly Leu Tyr Glu Asp Arg Leu Arg
130 135 140

Gln Lys Val Glu Trp Val Glu His Gly Val Glu Val Ala Glu His Arg
145 150 155 160

Tyr Phe Phe Val Gln His Leu Ile Leu Ser Ile Met Lys Asn Val Pro
165 170 175

Lys Ile His Asp Cys Ser Ala Phe Asp Val Phe Val Asn Glu Glu His
180 185 190

Leu Pro Val Val Val Leu Glu Ser Tyr Thr Ala Ser Thr Ala Pro Val
195 200 205

Ala Ser Gly Gln Ser Pro Arg Gln Leu Asp Val Pro Leu Leu Asp Ser
210 215 220

Leu Ala Glu Lys Cys Met Gly Val Leu Tyr Gln Glu His His Leu Arg
225 230 235 240

Val Tyr Cys Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Leu
245 250 255

Lys Asn Gly Arg Gln Glu Ile Gly Asn Met Leu Cys Arg Lys Glu Phe
260 265 270

Asp Asn Gly Ser Leu Pro Cys Glu His Val Lys Phe Ser Val Glu Arg
275 280 285

Ser Val Leu Ser Leu Pro Ile Gly Val Asp Pro Val Gly Gly Ile Trp
290 295 300

Ser Val Pro Ser Ser Ala Ala Arg Gln Asp Ala Leu Ala Met Gln Glu
305 310 315 320

Lys Gln Tyr Ser Gly Val Asp Leu Arg Asp Val Ile Met Asp Asp Arg
325 330 335

Thr Ser Thr Pro Leu Asn Asn Phe Asn Ser Ile Val Asp Leu Leu Gln
340 345 350

Trp Arg Val Ser Arg Gln Gly Glu Glu Leu Cys Tyr Cys Ser Ile Asp
355 360 365

Gly Arg Gly Arg Glu Gly Lys Gly Ile Thr Trp Lys Lys Phe Asp Ser
370 375 380

Lys Val Ala Ala Val Ala Ala Tyr Leu Lys Asn Lys Val Lys Leu Arg
385 390 395 400

Pro Gly Asp His Val Ile Leu Met Tyr Thr His Ser Glu Glu Tyr Val
405 410 415

Phe Ala Val His Ala Cys Phe Cys Leu Gly Leu Val Ala Ile Pro Ile
420 425 430

Ser Pro Val Asp Gln Asn Arg Leu Ser Glu Asp Ala Pro Ala Leu Leu
435 440 445

His Val Ile Val Asp Phe Arg Val Lys Ala Ile Leu Val Asn Gly Glu
450 455 460

Val Asn Asp Leu Leu Lys Gln Lys Ile Val Ser Gln His Ile Lys Gln
465 470 475 480

Ser Ala His Val Val Arg Thr Ser Val Pro Ser Val Tyr Asn Thr Ser
485 490 495

Lys Pro Pro Lys Gln Ser His Gly Cys Arg His Leu Gly Phe Thr Met
500 505 510

Asn Pro Gln Trp Leu Asn Ser Lys Gln Pro Ala Val Ile Trp Thr Tyr
515 520 525

Trp Thr Pro
530



<210> 55
<211> 2073
<212> DNA
<213> Cochliobolus heterostrophus

<400> 55



atacgtggtg gagccgtgca accgttgctg tgtgctgagt gctgagttgc ggtggagaat 60
gccccgtggg gtcgggatgg gtagcgctgc aggggtttag ctgagatgga ggggagagag 120
ggggggttgg ggatgtttaa aaggatgggg aggggtgtgt tcctgtgctt ggatgttacg 180
ctgttgcgct gcttacttgc tacgttgctc gtggcagccg actcagtctt tctacctgct 240
ttctttggct ctgtctcttt tttttattta cttggggcct ttgagatagc tcagagaggc 300
gaaagggttg gagataagag acggtgcgaa atagagggcg agtacgatga gcgtggataa 360
aatgcaggat gaaaaggttg agcggagtgg gagtgagggg tttgaagagg ggcttctgga 420
ggatccgaag gcaacgagta ggttgttgtt caagatcgat tgtcggtatg tttctctctc 480
cattcctgct ctccatgtct ttatcttgag ggcttttgtg gatgatgtac catcctgccg 540
gttctcgccc tgctgttcct gtgctcgttc attgatcgta caaaccttgg gaatgcgaag 600
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aLLLL i— y y L. L. 
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L,y y dy dd uy d 
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f*a /-if- ft ft as ft ft c* 
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f"r4— ft ft +" 4— 4— 4— ft ft 
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gcucgaggcc 
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L- y y c dy d 
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a ft ft ft 4— ft ft o /~f 

ddyyLuyydy 


*~\ 4— ft ft a 4~ /~« 4— 
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ft ft 4- 4— 4— a a r~< 

y cuuy auadc 






ft ft ft ft pat - <T /-« 4— 


/-i 4— /— «• fi 4— ai +~" z" 1 +™ 
t_ L.yi_ LdLLLL 


^-i ft 4— 4— /-t /-« riaat" 


4— /~i «a 4— /~( ft ^ t\ 4— 
LCLdL UyddL 


a ft ft ft rrrrt" 4— 4— 

edgecy y uuu 




ft ft t~ 4— as ft ft f ft ft 
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gtdCCCLCCC 


<""»<Tf- f - ft f 4— «3 4— jttj 

cguugcuduc 
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dtcyyCLdCy 
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y ay ucdUCdy 




r</^ a i> +■ ft ft o /~i it 
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ddCCCyddag 
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*^ 4— ft /~< ?a 4 — 4— 

dauy aca.uuu 


4" /"i -a frfr ft 4"~ ^a 

cuca.uggcua 
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caggcauy ua 
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4- 4— /~i /~i 4- /— i /-« (~t 4— r-i 
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CLdLyidLLC 
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4— ir-i /-« 4— ft ft ft ft 
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t-cyycgcLyc 


^ ft 4~" 4~* /^f q 4— 

ay c u cgcgau 


4~ fr ft ft "3 4^ ft ft fr 

ugccaaugcg 


gguugguucg 


ucgcgagcuu 


1 / *tU 


tacgtatcag aagagcgaga agccgaattt ccataagagt catagcatta tgctggggtt 1800
gttgtgtgcg gcttgggttt tgtaagttct cttctccttg ttctcttttc tagtgtgtac 1860
aggtggattt ccatcgtttt gctggcggga tatgcagcta acgtgaatga tagggtcgca 1920
gcgaatgtgg cgtgggtgtg gaaaatcaac cgcgataagg cgagtggaaa gtatgcggaa 1980
ttcgaaggac gaggagatga tagggatccg gcgtttaaga tggtgatgta agggattttg 2040
gatctgggtt gggttattat tagcatgatg ata 2073



<210> 56 
<211> 487 
<212> PRT 
<213> Cochliobolus heterostrophus

<400> 56 

Met Gln Asp Glu Lys Val Glu Arg Ser Gly Ser Glu Gly Phe Glu Glu
1 5 10 15

Gly Leu Leu Glu Asp Pro Lys Ala Thr Ser Arg Leu Leu Phe Lys Ile
20 25 30

Asp Cys Arg Tyr Val Ser Leu Ser Ile Pro Ala Leu His Val Phe Ile
35 40 45

Leu Arg Ala Phe Val Asp Asp Val Pro Ser Cys Arg Phe Ser Pro Cys
50 55 60

Cys Ser Cys Ala Arg Ser Leu Ile Val Gln Thr Leu Gly Met Arg Arg
65 70 75 80

Phe Leu Val Trp Arg Met Ile Ser Ile Leu Arg Thr Thr Ser Thr Leu
85 90 95

Leu Gly Phe Ala Ser Phe Thr Leu Arg Ile Leu Arg Gly Lys Leu Pro
100 105 110

Val Trp Gln Met Gln Ser Arg Arg Leu Asn Leu Cys Ser Glu Leu Pro
115 120 125

Ser Asn Leu Leu Leu Lys Lys Val Ser Pro Lys Ile Trp Leu Pro Phe
130 135 140

Leu Thr Ala Ile Trp Gly Val Leu Thr Met Cys Leu Gly Phe Val Thr
145 150 155 160

Asn Phe Ala Ser Phe Ala Ser Val Arg Ala Leu Leu Gly Val Ala Glu
165 170 175

Gly Gly Leu Leu Pro Gly Met Val Arg Phe Trp Arg Arg Asn Lys Pro
180 185 190

Ser Phe Ala Asn Ala Leu Leu Gly Leu Tyr Leu Ser His Phe Tyr Arg
195 200 205

Arg Gln Glu Leu Ala Leu Arg Ile Gly Ile Phe Tyr Thr Ala Ala Ser
210 215 220

Leu Ser Gly Ala Phe Gly Gly Leu Leu Ala Arg Gly Leu Asn Ala lie 
225 230 235 240 

Gly Pro Ala Ser Gly Leu Glu Gly Trp Arg Trp lie Leu lie Val Glu 
245 250 255 

2 5Gly Leu lie Thr Val Gly Val Gly Ala Cys Ser Ala lie Phe Leu Pro 

260 265 270 

Asn Ser lie Glu Ser Ala Gly Phe Leu Ser Pro Ser Glu Lys Ala His 

275 280 285 

Ala Arg Phe Arg Leu Gly Glu Ala Ser Ala Ser His Glu Arg Phe Asp 
30 290 295 300 

Trp Ala Glu lie Lys Arg Gly lie Phe Asn Leu Gin Val Trp Leu Thr 
305 310 315 320 

Ala Thr Ala Tyr Phe Ser lie Leu Ser Gly Leu Tyr Ser Phe Gly Leu 
325 330 335 

3 5 Phe Leu Pro Thr He He Asn Asn Gly Phe Ala Lys Asp Pro Asn Lys 

340 345 350 

Ala Gin Leu Trp Thr Val He Pro Tyr Ala Val Ala Ser Val Phe Thr 

355 360 365 

Val Leu Val Ala He Leu Ser Asp Arg Leu Ala Leu Arg Gly Pro Val 
40 370 375 380 
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Met Leu Cys Thr Leu Pro Val Ala lie lie Gly Tyr Gly Val lie Ser 
385 390 395 400 

Gin Ser Thr Asn Pro Lys Val Gin Tyr Gly Met Thr Phe Leu Met Ala 
405 410 415 

5Thr Gly Met Tyr Ser Ser Val Pro Cys lie Leu Ser Trp Asn Ser Asn 
420 425 430 

Asn Ser Ala Gly His Tyr Lys Arg Ala Thr Thr Ser Ala Leu Gin Leu 

435 440 445 

Ala lie Ala Asn Ala Gly Trp Phe Val Ala Ser Phe Thr Tyr Gin Lys 
10 450 455 460 

Ser Glu Lys Pro Asn Phe His Lys Ser His Ser lie Met Leu Gly Leu 
465 470 475 480 

Leu Cys Ala Ala Trp Val Leu 
485 

<210> 57
<211> 1900
<212> DNA
<213> Cochliobolus heterostrophus

<400> 57

ctgccgacgg tagcttcgga gaatccaagt gtgagggcca tgctagcccg agaccggcat 60
tgcgctaatt ggaccctggc ctgtaacgtg ggaaggacga acagcacagg tgcaggcttc 120
tagggctgca tgcagtgcgc atcatctgca tgcacttgct gtgccaagtc gtgtactaca 180
caagtgcgag ttgctatttg taacgaggaa ccttgtattt aaaagtgtat acgtgaggta 240
cgtgtgttcc agacctccaa atctaaagct actaaaacaa tagaaacagc ggagtctact 300
ccgacaaggt caagtgaaag gcggcggcat aaaagtcaat cgaatcaaag tacacggaca 360
tacgagcaat ctacacacgg tcatggctat agcttacttt cgttctgctt caatcgtatg 420
acgccctatt catgtaagca cagtctacta tagcagacat aagcaagctg cttacctctt 480
ggacgcagct ggcaatgagc gtgccatcct tggtatacat tctctgggaa acgaggccgc 540
gaccatcacc agcccaaggg gtctccatct cggtgaagat ccattcatct gcgcggaaac 600
tgcgaggatt gtgaaagtag atggtgtggt ccagactaac catcatgcca atctcaggct 660
ttgcgtctcc gctctttgcc aggtcttctg ctttcctcaa ttcacgtatg cgctgcttgt 720
ctgattcgtt gacaaagctt tggcgctgta actcggcatc atccatctcg agcagcttct 780
taagtacgtc ctcgtcgatg ctcgacctgg ccctgctctt gcgctggttc gagtagcgca 840
gaagcttgtg cgcacgcgcg acggtgccga tgaagtagct atcggacatg tatgcgatgg 900
cggagagatg ggcttcgtga ccgccagcgg gggagatttt accgcgagcc tttatccatt 960
gtcggcattt cttggtgtgg ggcttgtcgg agtcgtctgc tagatatgag ctagggcagg 1020
cttaaggatg gtatgcgaag tcactcaccg ttttcaatgg gcaacagctg ggtctggaag 1080
ggactctggc catcgttggg cgtcttcaag tcgtcgctac cttccttggg cgccgggacg 1140
tctggcatcg ggtagatgtg ctcgaccttt tgagcgcctc cactgttctg gcgaacaaaa 1200
ctcatggtcg tagtgaagat gacgttgccc ctttgccggg cctgcaccgt cctggttgcg 1260
aacgactttc ccgagcgcac cctttctaca tggtatatga cggggatctc ggagttgcct 1320
gcaaggatga agtagcagtg catcgaatgc acagtgaagt cggggtcaac cgtcttctgg 1380
gcggcgctga gtgtctgggc aatggcagca ccgccaaaga tgccgcgcgc accgggggga 1440
tgccataggg gacgagtgtt tgtgaagatg ttgggatcaa tgtcggccag ctgcgtcagt 1500
ccaagyacgl ucucaauggc cgactgggag tggtcggcgg gcggggggcg gaugagggcg 1560
gccatggtgg tggctgatag ttttcctgtt ggtggatcgt tctgtgttct gcgaaaagga 1620
ggccagtgta gcaagaccag atgcaagcag cagcagcgag cggctgtgtg agactttggg 1680
cgtcgtcatt tccggggcac gtcaaagcag cgcagacgcg catgagccga ggcacaatga 1740
tcatcggcca tgtgggagct tgtcgcgccg aacacgtgac tggccgctga ctgatggggg 1800
ctgactaagc caggcggcgc caagccgagg agcaggctgg ctctggggta aaaacgtcat 1860
actgggcttg ccgggccctg cgcagatgcg tacctggctt 1900



<210> 58
<211> 368
<212> PRT
<213> Cochliobolus heterostrophus

<400> 58

Met Ala Thr Leu Ile Arg Pro Pro Pro Ala Asp His Ser Gln Ser Ala
1 5 10 15

Ile Glu Asn Val Leu Glu Leu Thr Gln Leu Ala Asp Ile Asp Pro Asn
20 25 30

Ile Phe Thr Asn Thr Arg Pro Leu Trp His Pro Pro Gly Ala Arg Gly
35 40 45

Ile Phe Gly Gly Ala Ala Ile Ala Gln Thr Leu Ser Ala Ala Gln Lys
50 55 60

Thr Val Asp Pro Asp Phe Thr Val His Ser Met His Cys Tyr Phe Ile
65 70 75 80

Leu Ala Gly Asn Ser Glu Ile Pro Val Ile Tyr His Val Glu Arg Val
85 90 95

Arg Ser Gly Lys Ser Phe Ala Thr Arg Thr Val Gln Ala Arg Gln Arg
100 105 110

Gly Asn Val Ile Phe Thr Thr Thr Met Ser Phe Val Arg Gln Asn Ser
115 120 125

Gly Gly Ala Gln Lys Val Glu His Ile Tyr Pro Met Pro Asp Val Pro
130 135 140

Ala Pro Lys Glu Gly Ser Asp Asp Leu Lys Thr Pro Asn Asp Gly Gln
145 150 155 160




Ser Pro Phe Gln Thr Gln Leu Leu Pro Ile Glu Asn Ala Asp Asp Ser
165 170 175

Asp Lys Pro His Thr Lys Lys Cys Arg Gln Trp Ile Lys Ala Arg Gly
180 185 190

Lys Ile Ser Pro Ala Gly Gly His Glu Ala His Leu Ser Ala Ile Ala
195 200 205

Tyr Met Ser Asp Ser Tyr Phe Ile Gly Thr Val Ala Arg Ala His Lys
210 215 220

Leu Leu Arg Tyr Ser Asn Gln Arg Lys Ser Arg Ala Arg Ser Ser Ile
225 230 235 240

Asp Glu Asp Val Leu Lys Lys Leu Leu Glu Met Asp Asp Ala Glu Leu
245 250 255

Gln Arg Gln Ser Phe Val Asn Glu Ser Asp Lys Gln Arg Ile Arg Glu
260 265 270

Leu Arg Lys Ala Glu Asp Leu Ala Lys Ser Gly Asp Ala Lys Pro Glu
275 280 285

Ile Gly Met Met Val Ser Leu Asp His Thr Ile Tyr Phe His Asn Pro
290 295 300

Arg Ser Phe Arg Ala Asp Glu Trp Ile Phe Thr Glu Met Glu Thr Pro
305 310 315 320

Trp Ala Gly Asp Gly Arg Gly Leu Val Ser Gln Arg Met Tyr Thr Lys
325 330 335

Asp Gly Thr Leu Ile Ala Ser Cys Val Gln Glu Val Ser Ser Leu Leu
340 345 350

Met Ser Ala Ile Val Asp Cys Ala Tyr Met Asn Arg Ala Ser Tyr Asp
355 360 365

<210> 59 
<211> 42115 

<212> DNA
<213> Cochliobolus heterostrophus

<220>
<221> misc_feature
<222> (1)...(42115)
<223> n = any nucleotide

<400> 59 

gatcttcttc acaatagtcg tctttcccgc attgtccaga cccctagtgc tgtcagtcct 60
tgccacaacc attcttctgc aaacacgcac agcatcagga tgcgcatctc cttgtccttt 120
aagcgagctt ttcgtaatat cgaaagcatc ttgcgctgct caaaatctga gaaaatggtc 180
ctactaggca gaagcaagac agtgataatg ggcttcccag agccaccttg gagctaagcc 240
gtttgcggac gcctgcattc aacgccaact cgctacgctt ctttggagac aacgtctttc 300
cttgtcagat gatcgacact gcgttgatga ctcgaaccag ttgggaggtg tagttcctcc 360
tttattttat tgagacatca tggcgacagc tgtattgctt gcccgtatgc tctttcctac 420
taacagcacg attgcttact atgtgaagac tcgttggaca tgagcgcctt accaacaaat 480
actcccactc tatagaaaga agtcgaggta aagtgaagtc aagtgaagtg aacaagcatt 540
cactgtatgc tttaggcagc tccgaccaag tgtatagaag gctgcatcat ctgccattcc 600
acctctccac cttctgcttt tcgccgatcc ctcttgcttt acttagaacg ctcctgatat 660
ccgttccttc tacaaaatcg aagaaggtct gtatcaggaa cacagccatg gaggactcgt 720
gtctttgcac ttcactctca cagccaccgg agcttgaact gacaagattg ccatctttcg 780
atacccacac caatatgtcg aagcaacaga aaaacagaga tatgacttgg cagcaacagg 840
cagacttcgt gcaggccaca gtaagtgaag taagccttat gaccgcatca tgtctcacta 900
tcctttcgta atcatacacc aaccgtggtg aaagccacag tagacttttc tagcaccctc 960
accattcccc gctctcaaca acacctcatc atagttttct attactcact acctacccca 1020
tcatccccca ttttaacata tcctgcgctt gtatgctaac aacccaaacg acgaaacaga 1080
ggccactcaa acatcacttc cgtcgacctg ccaagaaacc accacatcac aaaattaata 1140
agaaagcgga tgcagtagca aaagaaacgc aaaacagcgc gccgttacag caaaccgagt 1200
caccatccca gcctgactcc agacgcaaac atcgtgcgca gcagccaact accttaccac 1260
ctcctaacca cgtctgtcca gctgcacttg ccgtgacatc gttgcccagt caagacaaga 1320
ccaaaaacgc atccgaccct ccagtcccct tgcagaaaaa acatacgcgc aacaagaaga 1380
gaggtaaaaa agaggtaaac atgactacgc tacgaaaaga agcctatgtt cctccacacc 1440
tgcgcagctg tccccctgcc aacaaggcct ctattccgcc acacctgcgc agccgccctt 1500
ctgccaataa agcgacgata gatccagggt ataaaaatgc tgccaccaat ggctcacctt 1560
cttcttcgaa aaattccaag tccacagttg ctaccaagcc cgagtcagtt cagtaagttc 1620
ttcttgacat tacaggttca gactgctctt ttttgtatct tcaccattct gacatgatcc 1680
aagaaaccaa aacaacatgc gcgaggcaac accacnttca ccagcaacca caccacatga 1740
gcctgtagag catgaacata aagacatggc tacccccgaa aacgtctggg gcggctggaa 1800
tgaaactgaa atcaacaatc tgcatgctca tcagaagacn tgctaaaccg cgttggaatc 1860
gtggcactca gccgtacaag aggaagcctt ggccaaaaca aagggacatg aaatatattc 1920
ctggcaaaag cgagagcgat ggtggtggtg tcaactgctg gtctgacagc aacggagacc 1980
ctgactacga tgtcaggaaa ctgctagact ggaacggcga ttggctacct gctccggaat 2040
catggtccgc tcgaagagga catgaagacc gtcaccttgg tgcacatgta gaacaatgga 2100
tgaatggaca ctcacaagag tgcaccagat ccgtatacta cccactcagt actttcagtc 2160
ccgaagatgg accttgcaaa gagctggcac ctcgttactg gcttgaggcg aaggttgagg 2220
gcagtaactt gagagaatct tggaagacaa tctctacttc ggacccaaag ccgctggatg 2280
atacggacat tactatccat ccaccttggt gggaattgta cgaggatgtg gtctattctg 2340
aggtgattca cgaggaaggt cagggtgaac agcatttcaa gcataggagc tgttacctga 2400
acagcctacc agcgccggag gcaagaatcg accctaccga tgcagagcat cctaccactc 2460
atctgatgct ggcttcggct gcagaaaagc ttcaagatct acaacaacgt agggaagcta 2520
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a <T<Ta a f-f— ft ft 

d.y y ddL-y LLy 


ft 4— f- /— ff— 4— ft ft ft ft 

CCuy LLyycc 


dddCggdaCC 


ycccayccyu 


ft 2 2 4— +— ft ft 2 4— 

gaat t cgatg 


t t tecaatge 


0 c q n 


ddy uudLyyci 


—\ fx ^ -t- /— t/— c4"" f* ft ft 

aga teg t cgc 


4~" ^ ft f \ ft ft 4"~ -T\ 

CtaCgCCCtd 


agaccaacau 


gtacatt cgt 


c c tgt t cage 




uctyuct.ya.uy l 


4— ft 4— 4— ft ft 12 2 4— 4— 

ty uLgycaut 


ft ft ^ ff 4~ 2 2 r^f 4 — /-i 

y gay u-aaguc 


4— ft 2 a fi 4~ 4— 2 ft 2 

Ly aac l Laca 


4— 2 ft 4— 4— /— » 4— 4— ft 2 

tagtt c t tga 


t tgact tgga 


0 "7 n n 


ddd.LL.L-d L-Cty 


aCadyydLyC 


2 2 <t4— 4— 4— 4- ft — * 
dctdgu.T_-L.L-Ca. 


2 2 /"I 4 — 2 2 /~1 2 2 4— ' 

aacLaacaau. 


2 4— 4— ft 2 1 ^-1 feft ft 

at tgacaggc 


gatt tac aac 


/ 0 U 


-J LdL L dLy L Ly 


a nr~i a 4~ a f» a +~ 
dy LdLdLLaL 


f"t"annpaapn 
L LdLyLddL-L 


ft 2 ft 4~ 4~* 4— 2 4~ /■"r 

gay l L uga L.g 


ft ft ft ft ft 2 ft 4- /T 2 

gy eg cac tga 


2 ft 2 4- /-< 2 2 2 4— /2 

agat caaat c 


0 p 0 n 


ft ft ft r* a <"T a ~> 

cyccaycycta 


4— a a ria ni" r^r4~ 

L.caacac uy u 


ft 2 ft 2 <"* +- ft ft 2 

caccag ugca 


ft ft ft ft 4~* 4™ 2 4™* 

ggCC L LCCdL 


2 *2 4— 4— ft ft 4— ft ft ft 

act tggt cgc 


2 ft 4— 4— y— i ^ « ^ fr 

agtctcaaag 


0 0 0 n 


•r\ ft ft 2 2 ft 4— 

a y c a. a. c y a. g t 


CCdgyaCCdd 


4™ /-i /*i f*ftft\~ +-2 4— 

ucccggt Lot 


g Lt.accgaaa 


agattgtagg 


cttcat cage 


0 q a n 


l uyyctu-yctu. l 


CtL* L-y L. dy L- L- d 


ft ft ft ahppf f*ft 
yyLdLLLLLy 


4— 4— /~l ftftff\- 2 2 

L LLLyL LdLd 


/-1 4— 4— 4— 4— ft 2 na 4- 
L L L L L y dy ct L 


ft ft 2 ft\~ 4- /2T 4— 4— (2 

ggcty ttg 1 1 e 


n n n 


/— f 4— <2 a f o f a <*r 
y LLLaLLL Cty 


(-t/i4- a "f" a prra rr 
yL Ld LdLy dy 


o a a a rr/T a 4- 4- 
Ldddgg LdLL 


ft ft ft 2 2 /~r+- /-t4- /"» 

y y uddy cgee 


4— ftffi— ft ft 2 4— ft ft 

teg Lggat eg 


4— /— i 4- f~i /~i 4- 2 ft 2 

t et ectagag 


0U0U 


_L \J ct y y l y y d v ct. 


a a rrp t* a c* f~* 
LdClvjL LdLLy 


ft ft ft 4— ft ft ft ft ft ft 

Ly l LLy uyy l 


/TfTrrt* at*pa /t4- 


2 ft "f - /2 2 2/222 

dLy LLddLdd 


/—1 4— 4— ft ft 2 <2r4— a /2 
L L LLy dy LdL 


■Dion 


pfpaanapprr 
l. L^a.d.y auuy 


nn ppa hpa a *t 
yyL-L-di_i_d dy 


(~trt H*afpaaa 
yy L Ld LL-ddd 


2 ft ft a 4~ 4~ /-i 4— 2 /—< 
duy d L LL- LdL 


t" <2 2 2 ft ft 4— /— « /— i 2 

LLddLy LLLd 


ft ft 2 pna rra a 4— 
LLdLydy dd L 


on 


rfrta /t a ft ft a T" /"y 

y y dy dy l d l y 


Ldy dy dLLyy 


2 4** ft ft ft 2 ft ft ft ft 

acggcagggc 


ft 2 /— f 4— 4— 4— f~i 4™ 

cage, llc tec 


2 /—I /— < 2 4— *T 2 2 

aegcat gcaa 


/■*c4— * 4— 4— ft 2 4— /2 /— < 4— 

gtt teat cgt 




/T+" /2/T/2T4— ft ft ft ft 

gtcggtcggc 


4— /-^ /-i /^i /-« o a a #~r 4— 
tCCCCaaayt 


ft ft ft 2 4*~ 2 4~" 2 2 ft 

ggga ud l. aag 


2 2 ft 2 2 /-< 2 r-l 4— /-T 

aac aac a c tg 


4—4—2 4—2 ft 2 4— fti— 

t tat aga tgt 


4— ft ft ft *ri 4— ft 4— 2 4— 

tgecat c t at 


linn 
J JUL) 


ft ft ari2/2/22./~«:-i 


CCddCyaaya 


ft 2 4™ 4"- /~Y 2 4~ ft ft 2 

gaLL.gaL.yc a 


ftft*f 2 ft ft ft ft 

ggtacccgcc 


ft 4— 2 ft 4— /— *r 4— ft ft ft 

c tac tgt cgc 


aggataaccc 


"3 *3 ^ n 

_5 J O U 


ID dy L LLaaLa L 


ycuycLty ac 


/^r 2 2 2 ffff\~ ft 2 

gaaaggugag 


4— +~ 2 4™ 4"~ ft 2 ftfti— 

ttatt caggt 


2 /•* tf t4~ /— r 4— 4— 2 ft 2 

aggtg ttaga 


atgagac tga 




/** 4-2 2 ft ft 2 4™ 4 -1 
CUaayyaL t C 


agat.caL.gua 


acggu- Lguua 


4—4—4-' ft 2 ft 4— fr 4— r~i 

t tt cac tgtc 


f~* /^i 4*~ +- /-h Hh- 

ccegtat c t c 


atgcggcaaa 




ti ft ft 2 /— c 4— +- ft ft 2 

dy l dy l Ly l d 


CddaCygadL 


4— ft 4~~ /- > t4 — naf"f n 

ug eye ca l lc 


4— 2 ft 4~~ /-« /—i 4— 2 4— ft 

tac tec tat c 


4^ /~i 4~ /T /"< 4 — 

at c tgctgta 


ccggcttggc 




a a a {~t\~ rrpfa 4— 
aaL-dy uyya l 


LdLyddd L Ly 


a4-4-r^4-4-4- rTr i4- 

dLLLLLLgCL 


4—4—4— ft ft 2 4— ft 4~~ 2 

tttgcdtgta 


4—4—2 2 /2 ft 2 4— 2 /~» 

t t aac cat ae 


ft ft 2 4— 12T2 2 ft ft 2 

gcatgaagga 


*5 c n n 


4— 4— ft /—i 2 2 ft ft ft ri 

l LLLaayyy l 


dd uyy L-uy 


2 ft 2 /~r 2 4~ ft ft 4™ 4— 
dy dy dLLLLL 


f— <■ 4— 4— /~» 4- 4— ft ft 2 4— 
y LLLLLLgdL 


/~-f 4— /— 1 r~i 4~~ ft 4— 4— 2 /—* 
y LLL LLLLdL 


4— 4— 4— 4— rt 2 2 ft ft 2 

t t ttgaagga 


-5 £T n 
O tD U 


O flaffa /—i /^f+- a 4— <— «- 
_s UattaCgLaLy 


^ ft ft ^ 4~* ^ 4~ /^r 

dcyyatyaEy 


2 /™Y 2 2 ftf 1 4*" 2 

agaaccgc ca 


rf^i +~ ft ft 4— 2 4— y— 1 2 

c tggt atcag 


2 4— 4— /"**f 2 z— 1 ft 222 

at tgaccaaa 


acttgaactt 


t *7 0 n 


ttcggggctg 


caaatgcaca 


cttigct ccga 


cactgtacag 


tagattt ccc 


ctattttcaa 


3 78 0 


•n <2 ri/T n 4- f- 
aLLyda LdLd 


4— i~n~t 4— o <*~i 4— /— < 

LCydLaCLCd 


2 2 4~* /— f 2 2 ft 2 4~~ 4~* 

aaLgaacaLL 


-T". - 4*™ r"T m 4^" 4** /^i 

gt gat gat t c 


4— ft ft 2 4— ft 2 ft ft 2 

tgeatgaega 


ctgaaatggt 


q q a n 


/— t4— pna 4— -u /~i /~i 4— 
yL.L-.yd LdLL L 




f 2 4" 2 ft ft ft ft 2 ft 

LdLdLLLUdL 


4~ a a 4— ft ft 4- <^t4- 2 

t aat cc tgt a 


2 ft ft ft ft 4— /—1 /— r 2 ft 

aegegt cgac 


ft ft ^ ft ft ft 2 ft ft 
gctcccacgc 




4- f-i 2 4- pro a nnp 

LLCtLycl dy L L 


a c* a rT"f~ rra /~*fT 
dLLdy l y d L- y 


LLdLdy LLLy 


/—1 /— ^ ft ft 2 2 arrpf" 
LLLLddayL L 


f-pia 4~ ft\-ftftftft 
LydLL Ly Ly L 


a (Ti - a a <^ a a 4~ 
dy LddLdLd L 




Q CaappappfP nrf 
_3 O aaLLaLy LLy 


r*nftf~tc* c* a +■ +— 


ft ft a+"4" p ft 2 /t 

ggau ucgagc 


4— ft 2 2 ft 2 <T 2 ft 2 

tyddCdgdCd 


4— ft 2 2 i-T I ■ ft 2 4— /~« 

t Cddg t cat c 


2224" #24— 2^24- 

aaat ct acat 


a no n 

'He U __i u 


gtgagtgtgc 


/"i 4™ 4~ *ri 4~ 4" q ^ 


— % -t- /«i +- ^ 4— /—i 4— 

dLLCCLaLCL 


4— /~i ft ft 2 2 /—I 4— /— 1 ^» 

tcccaac tea 


4— P-» rf~4 /— 1 ^ ft f \ —1 -— i 

taccaaccac 


aacccggaca 


/i n q n 
4 U 0 U 


4—4—4—2 ft 2 /2? 2 4— ft 
LLLaCagaL-C 


uy LdtCLgy l 


r__i 4™ /^l s~i ^ "ra ^ 4^ ^ 

aacccaaa L,a 


ccagacaacc 


atcagacat c 


gacagagt ct 


a 1 a n 
4 14 U 


4— 4— e~ t 222 /2* 2 2 — t 
LLy aadyddd. 




/™i *-* /~i 4— 4~ ft ft 

caacac l ucg 


4 — /^i f% 222 4— ft ft 

tgecaaateg 


222 /— c 4— 2 2 /—* 2 2 

aaagt aacaa 


aayayyyacc 


a 0 n n 


4— ft 22222222 
UCddcldclclclcl 


•^j ^ /^i ^-i ^ ^ 


ft 2 ft I - ' /— t 2 2 2 2 

cagLcaacaa 


ft 222 ft 2 ft ft 2 2 
gaaacacgaa 


2 2 4— ft ft ft ft ft 2 2 
aatggcgcaa 


gagaagaagg 


a 0 ^ n 


O 2 2 /— c 2 2 p~\ 2 2 

j uaagaacaacc 


ccagcaagac 


cacatcccca 


cot cgc cgc a 


gaacgaagag 


gaggaacaaa 


/i *a 0 n 
4 _5 __, U 


gcaaayyctc 


cggcggcc tc 


ttgagcgcaa 


teggagat cc 


agtcggtacg 


tctccttatc 


/i q q n 
4 ,5 0 U 


LLLLLLLLLL 


L-L- LLLdLL L-L 


4 - r~ , aa/*'(^<^>a(^«a 
LLddLLLdLd 


ft 4—2 2 ft ft /— 1 -~\ 

acctaaccca 


4— ft 4— ft ft ft 2 2 ft ft 

tcteecaagg 


paaprtf ppfp 
LddLy LLL LL 


A A A n 
ft *± U 


daUdLLyLLL 


LLLy LLLLy L. 


ft ft ft ft ft ft ft ft ft ft 

cy gey cgc eg 


ft 4— ft ft 2 /~f 2 2 2 4— 

ct cgagaaat 


4- /""»/2r4— /2 2/2 2 ft ft 

tegtcacagg 


ft ft ft ft ft 4— ft ft ft ft 

cccgctgygc 


4fc D U U 


#— r ft ft ft 4 -1 /"i 4— *2 ft 

g a ggg t c t c g 


gcggcaccac 


acgcggcgcg 


ctgggcccgt 


tgatgggcca 


c 9 a ggacgag 


4 jDU 


cgctctgagc tgctgggcgg caagaacgta gatagctaca gcaagcccga gaagattgcg 4620
ggtaaggaac agacgggaga taatccgttg ggcttggatc agacgggtcg atggggattt 4680
gaggatgagg gtaagaaata gaagagtttt tgtttgattt taagaaagtt aaaagtgagg 4740
aggccggggg agggggtata tataaatctt ttttgtatgg agggaggaaa ggaggaaatc 4800
aaaacatttc actcatgcca ctatctccca acacaccttc ttcaaagtac tcgtgttcct 4860
catgtcctcc atgttcttga tgtgcatgtc gccgcgatcg aaacgtatca tctagctcct 4920
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gcactgtgcc 


tgtactatcc 


ccaaaccccc 


cttcctccat 


gctatctctt 


gtacccggta 


4 y o u 


ccggtgt ccg 


gccagtcagt 


at at cataca 


aactatcatc 


tgatcccgct 


t cctcgccct 


D U4t U 


cct cttcgcc 


ctcctcttca 


ccctcttctt 


cgtcttcatg 


atcaaaccgc 


agaatctggt 


biUU 


ctttgccctc 


cggccaaacg 


acaacggaag 


cttgagtatg 


cgcgagcgcg 


aaacaaggtt 




DttLCaCUaCL 


agcugcaggg 


cccgagagcc 


aggacaeggu 


ccagcgugcg 


ggagtcgega 


coon 

O A A U 


ctgcgggttt 


agcaatgtga 


ggggataacg 


agaggatgga 


c 99tgga-tgt 


tgcgtagtgg 


c: o q n 


ccga.tgaa.tt 


cgccgatgaa 


gatgacgtcg 


a 9^gggagcg 


c tgtgaagcc 


gtgtacaagt 


Oo4U 


acacgactgg 


ttcgtcatgc 


gcagtttgga 


taacgagacg 


attggggtcg 


c a ggggt g c c 


04UU 


agaggagtgc 


tttcacagga 


gcgtacatta 


tgaggat cga 




agacttcgea 




"1 f\ f~t/~x 4~" /~i 4"* 

lUggcCCCaaat 


ccagac uytg 


cagggrgrgc 


4-» 4— /^i 4— 4— 4— 

tytcgucccu 


4— j^i q 

yCLtgCyCaC 


m 4^ -(-^ 4— /*~f /«t 4** 4^" 

aLtgLgcccc 


e: e: o a 


cggagttgaa 


gctgagcatg 


ccgatgcctt 


gttttaggag 


cgcatttt eg 




O O C5 U 


gggcggcttt 


ggg a ggtgtg 


gcgggttgtg 


gtgtgagtgt 


gaaact a egg 


gcgcccaggt 




ugt-cgac u ty 


i-i 4* /"i 4™ /^r 4— rr^ *i /*i 

CL-CugtyLdC 


acuggcgcgc 


uggguacguc 


y augacy y y u 


gtgtggrcga 


o / u u 


ggaacaggat 


gggtgcgaat 


gtgcgtgtag 


aaaggatgcg 


aacacgaegg 


tec cage cgc 


3 / O U 


IScaactgcgag 


acgttcatgt 


ccagggaccc 


attctagact 


ettgatgect 


aggecttcta 




catcccatt c 


gctgacgt cc 


tcggatgctt 


cgcgggttat 


ggtgcggtac 


aaatgeccat 


C Q Q A 

b o o U 


ccgccgtata 


tatcaaagct 


tgtaacccgc 


agaegcageg 


tcccagatgg 


ccagccagcg 




cccgtcacga 


ctccatctca 


gaccagcggc 


gtctgtagta 


999 a 9"ttcga 


c t cgatt cag 


C A A A 


aaccttgt ac 


gtctgcggtg 


caagaagcaa 


caagatat eg 


gt ccct gat g 


cangacacaa 


bUbi) 


2 0 taatgccaga 


acacgccc t t 


gt cccciz tec 


at t cct caat 


ccagt at cgt 


cagcaggt eg 


/ri on 


gtaaccccac 


ccct tgccat 


ctttaccagg 


aaactt egga 


t cgegtat ct 


ccactacccg 


ez 1 Q A 


acccgtctt c 


aagcaccata 


t ctt aacaca 


ggcggtaaag 


t eggtccaaa 


caagcacc tc 


coyi n 
bz4U 


ytccuc ug uu 


ccuccaaacu. 


cgacy ugaac 


— i_ 1— 4— 4— i^i r~i r~\ /—% 

attcuLcccc 


T% 4-~ /^T Q f~% 

auy ccaccag 


agccaLtgct 


bjuu 


aat cacggca 


4** 4" y-i +*> 4— 4— 4-* 4— 


catcgeggag 


ategtaaacg 


cgcgcggrgr 


cgucgrcgga 


bJbU 


^ o uaugaggacg 


cgauxcgagc 


agggacgegg 


ug uccgtga u 


ga ucgacggg 


gaggt.yua.gi- 


f A O A 


99"tgggcgaa 


gatgtgcgtg 


ttgatgaggt 


caagggegga 


atgaccaggg 


gtgaccaggt 


r/i on 


aatcttcgac 


gagcgcaaat 


catgggtgga 


tgggagggcg 


atagtaegga 


ccacctcgaa 


/■r/i n 
bo4U 


agtattgaga 


catcggattt 


gcaaacgtgc 


acegttgaca 


caggcagtat 


gtgtcgcggt 


boUU 


cggcgaggga 


acagacaatg 


tcgtggctgg 


tgcgaatcag 


taactgctgt 


gtaggatagt 


r~ t~ r~ r\ 

6 6 6 U 


J uctgtttatta 


gacgcacatt 


tgattugc tg 


ggatatctcc 


atgttct cca 


ttgcgctgga 


D /ZU 


cactgacggt 


cgtctcaagt 


ggctttgt ca 


tsgggs-tgtt 


tgtgtccggt 


aaacaaaagg 


67 8 0 


gagcgacggg 


cgttcagcca 


atgagcgttc 


gaattggccg 


caactagegt 


gaacgctgt c 


£T O A A 


cgcatggcc t 


ccgggcttgc 


tegctcatat 


gtacaggect 


cgtggttaaa 


aTnagcucanc 


£T Q A A 

o y u u 


cgguucaaca 


gacgca ucca 


gagt cattgt 


aauccgagcc 


aaggacac eg 


4— 4^ 4*" 4^ t~t s~*t /™i m 

ugutLCycag 


/T Q c A 

o y o u 


gtccaaagac ttgatttcta ccacacccan acgcaccaac agtggggggt gtttgtatgc 7020
acatcaaaat acaacaaaaa aggtatatgg acaggcatga aacgtactga atacagtctt 7080
cgagagaaag catatccagt atgaaagcat ggccgtgcaa aaaaaaactt gtatatgaag 7140
aaatggtgaa gaaaatgccc aaacgcttgc ttgccaaatc actaaattcg aaacatacat 7200
caaccttcat cttcatcgta aaaacctcaa gaaagctcat gtgctgtaca caggagcctt 7260
ggtcaagact gttttgccgc gaatctgcac tgcgagtccg acagccaagc gtacgcgggg 7320
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tggtgcagac 


tcgtctctcg 


taccagtcga 


gctgcctggt 


gcgttttgtg 


tcagactctg 


"7 "3 O A 


ctggctacct 


ccttgaccaa 


gagaagattc 


ccgcttgcct 


a 999 aa 9 ca 9 


gagtgactcc 


•"7 /I /I A 

/440 


tgcctt cttg 


gacgtggcga 


aaccaaacat 


getgegttte 


ttgtcctctt 


ttgaagtctg 


>7 rr n A 

/ 0 u u 


accaacttgc 


ggttgaat tt 


gat ettgetc 


tgegaeggag 


acgccatcat 


ctggcttcga 


1-7 r ^ /\ 

/ b 6 0 


C 4— r*t t~\ /~i 4— /~r /~« 4— +— 

ocgccacyccc 


ft /~«f 4"* ft ft ^ ^ 

ctgatygaaa 


u tc ugucagc 


aucauccagc 


ucgacua ugg 


gagctgttaa 


/ DZU 


gg c ggg t a. t g 


eeggauueca 


gctt catcaa 


gt cgctcaca 


atgecatgae 


GG 99^'99 aaa 


/boU 


gtactcaatc 


tgcac tt ct c 


ggctttctaa 


tggcaaatgt 


tgggcaatac 


tgatagcaag 


1 1 4 U 


ctcaatgatc 


attggaacac 


ctccattcag 


ateeggagge 


geggggtcat 


t cagatgaat 


t 0 n a 
/ O U U 


ctggagagat 


gcgat cagtt 


tttcgttgag 


cgtttcggt c 


agttt ctctc 


gataegtagt 


/ 0 b U 


"1 f\ ft r~i 4™ 4** ft 4~" ft ft 4~ 

xucycuugtyyt 


get tLcagac 


tatcegcaag 


gec ttctagt 


gt ageaagee 


gccagctgac 


*7 q "~> n 

/ y z u 


«—\ *— * -J - ^1 T - T - ft 

aaLCtLCgCa 


gcaagugacu 


cttcatcttc 


uggacua ugg 


caagggggcg 


agaactt ccg 


^ q d n 

/you 


gat att cgac 


acaatcgact 


gtagttgtgt 


egaaaacget 


ggct caagat 


ccggatgaaa 


Q n A A 
0 U 4t U 


gtatttatca 


aatatgtt ct 


ccactagcca 


ttgagagatg 


aatgeacgae 


egactgeagt 


D T A C\ 

blUU 


catttcctgc 


ttgcccgtct 


ccacagctgt 


ct tgt tgact 


actgggtgca 


gccactgtgg 


O T ez A 

0 16 0 


ibtattgacttc 


cagtccttgc 


gaatggaaaa 


ggacagttgt 


gctattagtc 


cgt etaageg 


Q O O A 

0 z z u 


at tgaagcgg 


gt tgaatatt 


cactgtcatc 


ccacgatgtg 


cgcgaaacag 


ataaaege tg 


D O O A 

0 Z 0 u 


gu rxgcaagg 


gtat uc ugca 


gttggtggac 


cugggruugx: 


tgt tcgaaaa 


agtatctttt 


Q O A A 

0 J 4 U 


gacctt ttga 


tattuuccac 


ctgcagcctg 


tcaaac t caa 


tgatcttccc 


c ttgagat at 


Q /I A A 


tgtacttacg 


taaaacatca 


tgatccttta 


t cat ct ttt c 


aatttcctcc 


teegt cattt 


Q yl /T A 

0 4 b U 


2 Ocggacgcaga 


tgtgcgcggt 


99 a 99 u 9 uc 9 


cccgccccat 


tt c tgagect 


tgt t gacctt 


Q C O A 
O O Z U 


gcttgt cage 


t cgatatggc 


tgt ccttggg 


aggctgtggg 


tacgacgttg 


gcgt cgaacg 


Q C Q A 
O O O U 


ttgegaggee 


eggagacaca 


ggcggctggc 


cgaaagt ctg 


gecttgageg 


gaeggggget 


oo4U 


ggagctgaga 


tgact ettge 


gaeggggetc 


cc 99tt9t9 a 


ttggctactc 


tegtt cact t 


O ""7 A A 

0 / uu 


tcct cacagt 


f*c ft 4— 4™ fi 4*~ /^c /■■* 4"* ft 

gccucugcug 


ctact t tgag 


cact tggtgt 


cataccct cc 


tgt cct t gc t 


Q *7 C A 


z ogacgegatye 


/^-f* 4** 4*"" 4** ft 4*" 

yuugtucagt 


gaatgtagct 


ccgaagaizag 


agggugct tc 


ft fi fi ^ fr f^i 4~ /— «■ ft f* 

ccctgctgcg 


p p 0 A 
O O Z U 


attcgttgtg 


ttggtagggc 


gtgggctgaa 


gaggegaegg 


c 9999 Gaa 9 c 


tgatgggacg 


Q Q O A 
O O O 0 


a gggcggtcg 


gcgttgctgg 


gactgctgtg 


gctcgattga 


ctgttgtaca 


gt ctgaggat 


O Q A A 

0 y4 u 


agtgttgtcg 


tatggcctgt 


gggtggtgca 


gttcggtctg 


cagtt gcgcg 


ateggcagag 


Q A A A 


cagtttgagg 


ctggcgctgg 


tctggttgag 


gctgaagttg 


agatacaaag 


tgaggctget 


Q A ez A 


3 Ogatgatgcag 


getttgeggg 


gaattctcta 


gcgatgattg 


attatgeget 


99 c 99 a 99 ct 


912 0 


gttgctgctg 


ctgctgctgg 


tactggtaat 


aagttc tgtt 


gaatgtcttg 


tac tggagee 


918 0 


ecu ugecca t. 


ft /-t ft ft i^t -J- - j^c4— 

geggcugagu 


ggacaaagag 


CEEggcyCgu 


tttgtggaat 


ft 4— ft s~*r 4— ft 4— /^r ^ 

gtacgtctga 


Q O /I A 

y z 4 u 


aagttgcctg 


accccgttga 


actctgggta 


gtgggctggt 


agctgtgctg 


tggctgcttg 


Q "3 A A 


ft ft ft 4— /~r •*> /— < 4— +- 

eggugaacu u 


gcuguuguug 


cgttggttgt 


+- r-r+- 4- /-t4- 4- /-r4- 4- 


gttgttgctg 


ctgctgtgga 


y J b U 


ccttgacctt gttggtgttg ctctgggccg tactgatcta ccccgccttg ggtcgaatag 9420
ctgccctctg tgctcaccct gccaagtggg ggccgaacgt aaggttcctg gtgcaactgc 9480
gcagcgaacg gctgagggtg ttgcacgttt tgttgatatt tggtgtcttt gggaggcact 9540
tgaggggact cgtcgctttc ctgctggagg aatggatcaa ggttgtcctc gtcttgctct 9600
cgcgatgtgg gcaagtggga agaggacccc tggtgttgat gccaatgtgg cggcggctct 9660
ctgcctgtgc gtgatactcg tgcgcccgag ggtcgctttt ccgtaccgac tgtcgcctgc 9720
ttaggccgtt cttctgagct ttggtctcct cgtccttgtg tcctctgacg cccgcggcaa 9780
tgcgactgcg cagcgactgc ttcttgtttt cggggctgct gtatgcagga ggaggagagt 9840
gaggctggta ggggacgtgg tttggcgatt cgtggcctgc ctgagatgta 9900
taggctgtfcg ctggcagttc tgcgatggct ggcctaagga gaggtgtgca ctcagtctgt 9960
gagttgcatt tcgcagagag tggtattcgc tgtattggcc ctgggtgctt ccagaagagg 10020
ccggtggttg tagagaagac tgctggctat ggacgggcgg ttgatgggcg tcggaggcct 10080
ggtatcgctg ggctgctgcg gcttggtcgc tggcttcctt tgcagatggc acgggcgaag 10140
gttggtccct tactacggtg ttatcgacag agtggtgcga ccggttcgac tttgtccatg 10200
gaaagttagg cattgcaatg atgacagctc ccagttccgc cgcgaagtat gttgatgatg 10260
gcggggcgga gggtgatgcg cccagctaat aataaaccaa gcctgagcaa cggttaacct 10320
aggcgatgat tgcacccgaa cgagacagca ttgcgcgcgg ttggacagct gtcttgaaaa 10380
ggatgggatg ggctacatag tagatgcgcg tcttccgcca tcagccccag ccctgcccgc 10440
acgtgcaggg ctgatgagca aatgaaacca gatgcaggcg cgagtcccaa tcccggtgca 10500
gctaactgca caaggagatg ctgatatggc ggcggtggca gtgctggcgt cccgcgatgc 10560
tgctttggta gtctaccaca ggcatgttac tgtgctatgc cgtctagtcg ctgtcaaagt 10620
gtactagttg aacgctgtat tatcatttgt catggcggat gcaacgcacg acaagcacgc 10680
caagtggggg aacagttcat gtacgtagat aagtatgacc cactggagat ttctgcctcg 10740
aggagacacc cagccttgcg ctccattctt cagcntgctg tctagttacc gaaggcgcag 10800
agcatacacg tgctacgtct ccacgagaga ggtaagcagt cctcctcagc tttgcccata 10860
tccatcacca ccaaatcttg atcaaaccca cagtgtctta cagtcaaaaa aagtatcaca 10920
ttacctaaac acgactttcc aaaattccat ctcacgtatc ttggcctatg ccggcccttt 10980
gacgcgcagg atcagctcga cgcccgactg ctcctaacgc tgcgcaggca acaaaccggc 11040
ccttagtata acaggcgtcc acgatcaggc cgccagcctg agtcaactaa aaagcaccgc 11100
tgctgtattc tgcacaaggt aagcaaaaag cgtctcgaaa gcttgacgtc gcaaggcggg 11160
gggcttattt gaccatacnt tactcttacc acttgccgca acatcatgtg tcgcgcgctg 11220
tgtatgttgc aagtatggta agcacagcgc caatcggatc tttacctcaa ttttacgccc 11280
tccatgacat tttgctggct caggtttccc cntgctgctc cacagttagg cccagtctac 11340
tctattttcg gccaccggac gtgtgctagc tttactccat ttgctaggca tctgcgatat 11400
tactcagtcg tcttcttacg tagccttcag tgctggattt gatatcagat acttgctcac 11460
cctaattcgt cactctgact actggattgt atctgggctt cacgattgca ttggtgattt 11520
tcaatgacta aaaaaaggtg cttgcgggtg tttcaaaagg aatgcatgtg tatgtcagag 11580
cggttatgca cgcttccttc agattgttgc ctccaaaaaa ccaacatatg cagcattgcg 11640
gcgctgggca aggaacgccg caacaatgtg acatcgcgca gcgcttccta atcagcctta 11700
cactctcact aaatcgtcaa gatcaagaac gaaagggtcc aatgccaaac gtggtatcat 11760
ttcctgtgtg gaagaagcat gacttcgtaa atcaagaggt tgatgtacat attactgaac 11820
ttcttcaaca taacccaacg cctcgccaac cttgactttt tgcccttgtt cgatattcca 11880
gtgaaaccca ccttttctct cgccacgtgt accactaaag ccctcgtcca aactaggtcg 11940
aatgcccttc ggcgcttcaa agactaagac aattgtactg cccaactgaa aaccacccat 12000
ttcctcgccg cgcttgagtg cgtaccctcc caagacacgg cttgcgctcg tgtaggaggc 12060
ctcagcgaat ccagaatacg gctcaccacg ggcagcggct tcttccgcag cacggtccgc 12120
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ft ft 3 /— t4" I 11 ft f* 

cycaycyccy 


J ■ 4— 4— 4- 3 3 ft ft 

guugu u aayc 


cgc ucgcgcg 


aagt t cgega 


tcaaagttga 


tct t aatgga 


±Z JL o U 


f \ *^ ^-i /—r 4- 4- 

aCCaaCgCUy 


g ttgcgccga 


ccggagtgta 


ggaaaagaaa 


ccccagcgcc 


at ct tec tag 




ga.ga.acc a. c a. 


cgctcgt tea 


gggtaaagag 


accaggcata 


gtgcgttgta 


ggtagggega 


t o *3 n n 


4— ^ ,<-i 4— ^ 4— -^i — . 


^ /^r /™i 4* ft ft ft ft "~\ ft 

ay c ucy ccay 


caaagugacg 


aegegae uca 


*1 ft —i T* ft /—i 4- 

acaaccca ug 


auacagguga 


A. A J5 O U 


oy tgg aacc uy 


4— ft ft 4** ;a /— r 4~ ft ft ft 

tggtag uege 


/—I 4" ft ft ft ft ft T\ ^5 /— r 

cuygcy caay 


?i4- -a 4- «j « m ^« ft ft 

auauacaacg 


ft 3 /-r4— 3 »— r4- ^ <^T 3 

cayuay uaya 


ft 3 ra i^i /-f 4- <n ft ft 

gaaccguagg 




4— /— r4— /— i 4— 4— 4— 3 3 4— 


gaggcgggtg


cccaccaugg 


/^i r**r 4" /*<4 — *^ 4~ 

gcgcrgugau 


4— ^-i ^ 4— ft ^» frs-*r 

u cac ucaagg 


caagguegge 


12480


4~ y**1 -1— 4— j~i 

acyuactucy 


gc uucugacg 


4™» rf^/*r rf^i 4^ 4^ 4* 

auggcuu uga 


eggaac ugau 


4— /T 4~" y**1 y^l /^#4" #^1 

ugauccgucg 


/— ^ y*i 4* 4* 4* y*i y^r y— i 

gca uu ucagc 




ayyctucccy 


4— y*i 4^ 4~* 4* 4~" 

uc u u u uggee 


auyguccgga 


gaagagguuu 


/^T/^4 - /^r »~i /~>r ■{— ^ 4 — 

gguagaguau 


4— y^r -3 y^r *^ 4" ^ y^i y^t 

a ugagauacc 


12600


z-~r 4- 4— ft *r\ ft r-r 4— 4- 4- 


ycaaau ucc u 


ca uccy cgey 


cacagugucc 


j_ ^4- 4_ / -,,~4- ^4-4- 

uc uucg uc u u 


/~*r 4^ rT/^f 4™ 4~ y^i 4 — 4~ 

gt-ggLgtcuL 




"1 A /— i 4— /— i /— f 4- ft ft 4" /— i — > 

lucucguycuca 


c uaycy cy aa 


4— 4" 4— ft ft ft o 4— 4- 

u uuyyya.au u 


4" ft ft 4" a n +• +" +" 

uycuacauuu 


4— ft ft 4- ft 4- ft ft 4~- *— y 

ugcuc uggug 


4- 3 /-i 4— f tff\*\ ft ft 4- 

uacuggnccu 




4™ /"< 1- o ft a 4" rtrif" 
Ly L ay aLttL 


o ft ft ^ ft o ft ft ft 4" 

«.y Lay dy uy u 


LLdddL LdLd 


Uy U LaLaLLL 


4- 4" <**r a /—i 4- 4- ft ft 4- 

uuyctuuuyuu 


ft o a /-< 4- 4- /— < ft ft ft 
UddL L ULyLL 


12780


it 4- ft ft 4™ /-i<T3 4" r-r 

y tyCLCyduy 


1 ■ ft ft ft *^ 4 — 4^ 
yLyCCaaaLL 


ft ^ ^ 4- ft ^ 4~~ /—i 4- 4— 

y aa uy o-ucuu 


ft ft ft ft 4— ft 4— ft ft ft 

gccgucugcg 


ft ft 3 i^-r 3 /-t 3 r^c 4— 3 

gya.y ayagua 


ft 4- ft ft ft 4^ 4- ft ft ft 

c uy cguuggg 


12840


/ 1 1 ■ 4- ft ft ft -~l 4" <""t 4— 

gtuyggauci: 


— » ^ ft ft t\ i^i ^^4* fi 

ayayy acy ua 


/— t ft ft ft 4- 4- 4— 

caccyyy u u u 




4— m ft *^ — \ — > ft ft 

uayaaaaagg 


eggegagguu 


n o q n n 


g gg g ta taca 


uguay a ucug 


guuecyagae 


uueggagaga 


cuagyaaaag 


4- 4- "^t ft -l4- -a «>-i /— r 

u uaya uac ay 


y d u 




gattatgggg 


cgacttgctt


gacaccaaat 


atccaagaat


acagcttgaa 




tccagycaca 


cgaaggtagt 


agggtatgtc


y^-r ^ 4— 4^ **i 4~ 4* /-v 

gauc ucauug 


aagcgacccc 


<~-» y-H y^ 4* y«* s~f y-^f «^ 

acag u cgega 


1 q n q n 


caacycuLty 


agaggaaggg 


uagacaugac 


eugaaeggue 


caugguccgc 


4* y**l T/*^r 4^* y^l 4~ 4~ ^ 4 — 

ucyy uc u uau 


13140


4— /^i 4—4—4- ft ft ft ft ft 

ucuui-cycy c 


4- 4" /'i 4- 4- <^ <t<t — ^ * — i 

u ucuucyyac 


gacc uuyc uy 


4™ y^l /^t 4— -*i ^* 

atccac uaca 


4— 4— rt ft ft -~\ 4- <-t ft ft 

uuyeca uccy 


csucccuucu 




ft 4" /^/"frm 4— 4— /-* 4- 

uuuyyuuuu u 


fr+- 4- ft f, 4-4-4-4- 

yuciuyuuuuu 


/**< 4" ft 4* ft ft ft 4— 4* /^r 

uuu uyuy u uy 


4"^4" ft ft ft 3 1-31- 

LdLy LydLdL 


a ft ft 4* /t 3 3 3 4- <-t 
dy U LydddLy 


ft ^ ft ft ft ^ ft ft 3 3 

LdLLL ciy y cl ct 


J. O A o u 


O A /— t ft ft 3 3 4- a ft ft 
*u UyCCaotaCCg 


^> /— r 4- /— f /— < «3 4— 4- ft 

ay uy caa u uy 


/— «r 4~* ■— 1 4™ 4" ft ft ft 4" ^"i 

y uau uyycuc 


ft ft 3 4— 4- 4- ft ^i /— 1 4— 

ccauuugacu 


4— 4— ft ft 4- •— i 4— 4- * 4— 

uuyyuauucu 


4- ^-r s ft 4~~ ft ft ft ft 3 

ty ay ugcy ga 




4" 4" ^-T "3 i^T ft ft ft 

a u uy ay ccy c 


gacctaaaag 


actccttcct 


gtgaaaattt


4— ft^f 4— ft ft ft 4— •— h 4— 

ugu uegcua u 


atgacgctcg


i lion 


ay ucgacgug 


a-3-ggtgcgag 


atggggcaga 


gaggcgcgca


4— /**c4™ ^f/^f/T 4— y^c 4^ y~< 

^gcgggtgtg 


4™ rf/^rj^fl - — » 4— 4™ 4— 

ugggua uau u 


i i / /i n 


gcatcgcggt


cggatagaag


ctcgaatgag


ctggcgcgaa 


cagggagtcg 


ccatgttatg 


13500


4- ft ft 3 4~ ft ft a 4- 

a ugca uy cat 
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ttgacagtac 


ttggtcatta 


tcgcgttatc


agtggaataa


aagtatgtac 


21600 


aagaccagag 


cagatactca 


cggtaggaac


acaggtttct 


cagcatccat 


ataccttgtt 


21660 


gtatcgtcat


acatgttgat 


catctcctct 


gcaagtaatc 


cactccaaca 


tcccataggt 


21720 
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caatagcaag 


atgtaagtga 


ttgaaactct 


cactgntcct 


gcatcatgtg 


ctacctacgg 


21780 


gctcttccgt 


accagcaatc 


tctcgagcaa 


gcaatcttgc 


ttccgagatc 


ttaggcaggg 


21840 


tatctcgaca 


agcgaatata 


tatgtattga 


tgacgaaacc 


cccatgtctg 


gtctctgaga 


21900 


gggctatgtg 


caaatagcct 


gaatgatcct 


acgtctgccg 


ggggatctac 


ggcaagcaaa 


21960 




agacgagtcg 


aagagaaaag 


agtagaggag 


aagatgttta 


caattcctag 


22020 


gtggatggga 


gtaccgaatt 


cgtttggtgt 


tacgctcatg 


ttgagcaaca 


acagctgtgt 


22080 


cactcgctcc 


actcgttgaa 


atctcgtata 


tgcaacggat 


gcggataacg 


tagattgaat 


22140 


gatggtacac 


tggtaaccct 


ggtgtatcgc 


aagtaagtga 


ccctctcttt 


ctctgtagtg 


22200 


gttccttgca 


gccatcaaga 


catggtcttc 


gccacgtgcg 


cacatccaca 


gtgctcgacg 


22260


cgcggctcgc


gtggaagccc 


catagattct 


ctagatcgtc 


aatcgatgtg 


gcgcatgtat 


22320 


cagcacgttt 


catacattga 


acgcgcaccc 


cgtaccagaa 


gtaaaaatag 


taaagcttaa 


22380


ttctgagcgt 


agcagatgat 


gctcagccta 


cgccatgaca 


gatggatggc 


ttgactcgag 


22440 


cacggatgta 


tactactcga 


ctagccccca 


cggctactgg 


ctatggctaa 


atgtgaatgt 


22500 


cgtacgcata 


catgctatgg 


ccgcttcagt 


gcatgtgtta 


acttaggggg 


ccaaacgata 


22560 


tcagcctgaa


cgtgggccaa 


tctattttct 


tcccttggaa 


tggagtgacc 


tcgcataaga 


22620 


cgtacatatt 


gtaactcagc 


ttagacataa 


ccatgctttt 


ctccacaaaa 


ggtctgcaca 


22680 


gatatcttgc 


tgaatgctta 


acgaacctcc 


ttaatgccag 


aaataaaccc 


gaaggtgtcg 


22740


tcgcctttcg 


cactcttatt 


ccaccaaata 


cacactacag 


ctgtcaaagc 


aacactacaa 


22800


atacacaacc 


ggagggtctc 


gatgcccaca 


gtaccctcat 


tctcgggttg 


aattacgact 


22860


ttgaaacgta


ccctttagga 


ttcaccaaaa 


taacacaaag 


agacccaaca 


ccctcgacta 


22920


gacgataaca 


ttttgtaccc 


ctcgtacccg 


cagcatctat 


cccttacttc 


tcttctccgt 


22980


accctgactc 


caggttctgc 


actaccttca 


tgaacaaaac 


agcaggagag 


agaagctcga 


23040 


aaagcactcc 


cagttgaccc 


aaaattcttt 


gaacttacaa 


atgcgacatg 


gctcagtgga 


23100 


caatagtagg 


aatggtgaaa 


aatgcgcggt 


gtctggagct 


agaccgggaa 


tcataacctt 


23160 


gctagaagat


ttaccgaagc 


agtatcggrt: 


gtatagtggt 


aaggataaaa 


cgctactgtg 


23220


tatgaagtta 


gcatcggtgc 


catggccttt 


acaacaagac 


cagcgaacat 


tcatatcatt 


23280


tcttgatttg 


gctgaaaaat 


accaagtcgt 


tgcgtaacac 


attataccaa 


aactccacat 


23340 


atatcatcat


gctaataata 


acccaaccca 


gatccaaaat 


cccttacatc 


accatcttaa 


23400 


acgccggatc 


cctatcatct 


cctcgtcctt 


cgaattccgc 


atactttcca 


ctcgccttat 


23460 


cgcggttgat


tttccacacc 


cacgccacat 


tcgctgcgac 


cctatcattc 


acgttagctg 


23520 


catatcccgc 


cagcaaaacg 


atggaaatcc 


acctgtacac 


actagaaaag 


agaacaagga 


23580 


gaagagaact 


tacaaaaccc 


aagccgcaca 


caacaacccc 


agcataatgc 


tatgactctt 


23640 


atggaaattc 


ggcttctcgc 


tcttctgata 


cgtaaagctc 


gcgacgaacc 


aacccgcatt 


23700 


ggcaatcgca


agctgcagcg 


ccgatgtagt 


cgcgcgcttg 


tagtggccag 


cggaattatt 


23760


gctgttccaa


gaaagaatac 


atgggacgga 


ggaatacatg 


cctgtagcca 


tgagaaatgt 


23820 


cattccgtat 


tgtactttcg 


ggttcgtcga 


ttggctgatg 


actccgtagc 


cgatgatagc 


23880 


aacgggaagg 


gtacacagca 


tgactgggcc 


acgtagagcg 


aggcggtcgg 


agagaatggc 


23940 


tacaaggacg 


gtgaagacgg 


aagcgacggc 


gtaaggaatg 


acggtccaga 


gctgggcttt 


24000 


gttggggtcc 


ttggcgaagc 


cgttgttgat 


gattgtgggg 


aggaagaggc 


cgaaggagta 


24060 


gaggcctgag


aggatagaga 


agtaggcagt 


ggctgtgagc 


cagacttgga 


ggttgaagat 


24120 
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ft 4~ 4— ft ft ft 4™ fr ft t -- 

cuugcguggu 
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aaggtgggga 
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tacaaagagc 
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acatagagaa 


acaaaaatgc 


cactactact 


ccatcgtctt 
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gcgacttcaa 


acttcgcatc 


cgtcttcgcc 


ttgacatcct 


caacgctaac 


ccccggcgcc 
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gtctccgtca 


gcgtcaaagt 


ccccctcttc 


ctgtttactt 


caaagacaca 


cagatcggta


26400 


ataatagtgc 


tcacgcactt 


tgctcctgta 


agcggcaact


ggcattcctg 


aacaatcttg 


26460 


ctggatccat


ctttagcaac 


gtgttcagtc 


gcgacaacga


cttttgtagc 


sitcgggsitug 
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P P'/T/T P /1 P 4- p p 

c 99y ccceg c 


gcagatgcgL 


?a p P 4~ rrrr P 4— 4~ P - 

acceggceeg 


■t4~ /TP P ^1 /T ^ /T 

gegccagaag 


aceacccace 


ceegcaeacc 


32760


O 4"" a p a +" a /-r -j— p -a 
^ J LctUdL cty LOd 


a "H r~r4~ i* i" pa h p 


4— p't" p r^i p p 4— <tp 

egecaccegc 


4" it+" p p jt 4— p p 4— 

egeccgecce 


cgacgcgegc 


ft ft ft ft n 4^ 4" ft ft 

ccgciiec egc 


32820


> 4- p /t /""i >n 4™ /t/t4~ ^t 

a e g c n e g g c g 


ca eegeeege 


acecceeccg 


ctaggcgcct 


egtatctgea 


tcttccctgt


"3 O Q O A 


ffri p 4— p; p p /—1 p 4— 

y L/L- LytyuuL 


gegc eegegc 


p 4~ /■» 1 ■ /tp 0 a 4- p* 

eegeggaaeg 


4" P /T P /TPT P P P (T 

eegeggeeeg 


p 4— /T/T 4~ /t/t ti 4— -n 

eegc egcaea 


ft ft ft 4"~ *a 4~ i^i 4™ it 4™ 

geceaecege 


32940


T /-1 4— ^3 p ^3 Tl ^1 ■— t 

cl C cl L cl C ct cl C a 


^1 /'-i ra /^i f7\ 4~ /^i 

CCaLCCCaUC 


ccgcttcacc 


4— s-% 4-" 4~ /t /^i p 4^ 

egeceegece 


ccc t ectegt 


gccacacatc 


33000


/T ft ft fr ft ft ft r?i /~"/ 

cgccgcccac 


1—1 f-% ^> +^ j~er*r 

aaCaCCatyy 


ctgcgaccaa 


ccccgagctg 


caggccaaac 


tgeaggaget 




j uggaccac gag 


ctcgaggagg 


gcgatattac


acaaaaaggg 


fceegtae tgc 


tgcac caeca 




j — 1 , — 1 fr /-i p t 4- 1 P P /T 

ccyccaLCcy 


/~i /*■« 4— /1 4- /-« 4- /t p i^*c 

ccececegcg 


+— s~i 4— *-\ — s 4— /-*| 

tgcgccadLC 


agtegcatag 


ctatgaaaaa 


cgtcgcaccg 


"2 "3 "1 O A 


4— p-p 4~ /tp 4~ /t 4— p 

tgetyety lc 


r~>r r~t "3 r 4~ 4~ /~« +" 

ycay LaLCta 


gggecegace 


eegcegccca 


geegcaggee 


gacctgaacc 




a pp a rta a p p p 
dy Uayad.L.L.L 


appppaappa 

deeeeddeed 


eccagegagg 


4~* /^i z*^! 4"* f% 

geececgcec 


ccgcaccgca 


4-« 4-^ 4~* ft ft 4~" *n 

ecceeegcea 


"3. "3 *3 A A 


4- 4- /-1 r^rrt" c*c*ri(~i 
ULuuy uoeyy 


4— prtrrn f-r4- ppa 

e e e y cty i-LLd 


4" P S PT"I (TPP'a P 

Lcacngcgac 


p rrs p -si a p p p p p 

cacaaccccc 


!a p =s 4— t 4— p p ?a rf 

acaeaeccag 


P 4— p p p p P /TP P 

cecccccgcc 


33360


ccgactcata


ccatgacgct 


tccgcacagg 


gccaattggg 


cgcacccatg 


ccatatgcga


33420 


acgcctccgc 


cgctgcctcg 


gggggctcgc


agtacatggc 


atacccgccc 


agccaagtcg


33480 


gccgttttca


agagaagcag 


ctgggcctgc 


gtacaaattc 


gctccagcgc 


aattcctcac 


33540 


agctgtcgca


aggaagcgag


acgttcattc


cacggcctca 


aacgcctgaa


tacaaccact 


33600 


cgcgcgagcc 


caccatgatg 


ggcaactacg 


ccttcaatcc 


agacaatcag 


caaagttatg 


33660 


atggccaatt


tggctctccg 


ggagaggcca


gtcgaaggag 


caccatgctc 


gaggtaaacc 


33720 
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agggttattt 


ttccgacttc 


acaggccagc 


agatgcaaga 


caatcgcgac 


tcgtatgggg 


33780 


gacccaaccg 


ctactcgtcg 


ggagatgcct 


tttctcctac 


cgccgcgatt 


ccacctccca 


33840 


tgatgaaccc 


caacgatctc 


cccttgggcg 


ctgctgaaac 


catgatgccg 


ctagagcccc 


33900 


gcgatctgcc 


ttttgacgtt 


tacgaccctc 


acaaccccaa 


tgtcaaaatg 


tcaaagtttg 


33960 


acaacattgg


cgctgtcttg 


cgtcaccgaa 


gtcgcacaca 


gccaaggacg 


actgccttct 


34020 


gggtccttga 


cgcaaaaggc 


aaagagacgg 


cgtccatcac 


ctgggaaaag 


gtggctagtc 


34080 


gcgcggaaaa 


ggtggccaaa 


gtgattcggg 


acaagagcaa 


cctctatcga 


ggcgaccgtg 


34140 


tggcattagt 


gtacagggat 


acagaaatca 


ttgattttgt 


cgtggcgttg 


atgggctgct 


34200 


tcattgcggg 


cgttgtagcg 


gtacccatca 


atagcgtcga 


cgactaccag 


aaactcattc 


34260 


ttctcctaac


gacaactcaa 


gctcatctcg 


cattgaccac 


agacaacaat 


ctcaaggcct 


34320 


ttcatcgtga 


cattagtcag 


aaccgtctga 


aatggccgag 


tggggtagag 


tggtggaaga 


34380 


cgaacgagtt 


tggcagccac 


caccccaaga 


aacatgacga 


tactccagct 


ttgcaagtac 


34440 


cagaggttgc 


ctatattgag 


ttctcgcgtg 


cacctactgg 


tgaccttcgc 


ggtgtggtgc 


34500 


ttagtcaccg 


gactattatg 


caccaaatgg 


cctgcatcag 


tgccatgatt 


agcacgatac 


34560 


ccaccaacgc


tcagagccaa 


gacacgttca 


gcactagcct 


acgggatgca 


gagggaaagt 


34620 


tcgttgctcc 


agcaccgtcc 


agaaacccca 


cagaagtgat 


cctcacgtac 


ctcgacccgc 


34680 


gcgaaagcgc 


tggtctcatt 


ctcagtgtct 


tgtttgcagt 


ttatggaggc 


cacaccaccg 


34740 


tatggctcga 


gacagcgacc 


atggaaaccc 


cgggtctata 


tgcacatctc 


atcaccaaat 


34800 


acaagtccaa 


catactgcta 


gcggattacc 


caggcctcaa 


gcgcgctgca 


tacaactacc 


34860 


aacaggatcc


aatggctaca 


agaaacttca 


agaaaaacac 


agaacccaac 


ttcgcctccg 


34920 


tgaagatctg 


tctgattgac 


acgcttaccg 


tcgactgtga 


atttcacgaa 


attctcggag 


34980 


atcgatattt 


caggccactg 


cgaaacccta 


gagcgcgaga 


actgatcgcg 


ccaatgctct 


35040 


gcttgccaga 


acatggtgga 


atgataatat 


ctgtacgcga 


ctggctaggt 


ggagaggagc 


35100 


gcatgggctg 


cccgctaagc 


atagcagtag 


aagagtcaga 


taatgatgaa 


gatgatacag 


35160 


aggataagta


tgcagcggca 


aatggctact 


ccagtcttat 


tggtggtggc 


actacaaaga 


35220 


acaaaaagga 


gaagaagaag 


aaaggcccga 


cagagcttac 


agaaatcttg 


ctggacaagg 


35280 


aagctctgaa 


gatgaacgaa 


gtcattgttc 


tggccattgg 


agaagaagca 


agcaagcggg 


35340 


caaacgagcc 


cggcaccatg 


cgagtcggtg 


cctttggata 


ccccataccg 


gatgcgacac 


35400 


tagctattgt 


agaccctgag 


acaagtcttc 


tatgttcacc 


atactcgata 


ggcgagatct 


35460 


gggtagattc


gccttcactc 


tctggtggct 


tctggcagct 


gcagaagcat 


acagagacca 


35520 


ttttccatgc 


tcgaccatac 


cgtttcgttg 


agggtagccc 


tacgccacag 


ttgcttgaac 


35580 


tcgagtttct 


gcgtactgga 


ctcctcggct 


ttgttgtaga 


gggaaaaata 


tttgtccttg 


35640 


gactgtacga 


agatcgcatc 


agacagcgtg 


ttgaatgggt 


agaaaatggt 


cagcttgaag 


35700 


ccgagcatcg 


atactttttt 


gtgcagcacc 


tggtcacaag 


cattatgaag 


gccgtgccaa 


35760 


aaatttacga


ctggtaagtg 


agctgccaac 


agagcaagga 


ctgtctaacg 


tgtcatagct 


35820 


cgtcgtttga 


ttcttatgta 


aatggtgaat 


acctgccaat 


cattctcatc 


gagacgcagg 


35880 


ccgcatcgac 


tgcgcccaca 


aacccaggtg 


gaccaccaca 


acaattggat 


ataccatttt 


35940 


tggattcact 


atctgagagg 


tgcatggagg 


tcctttacca 


agagcatcat 


ttacgggtat 


36000 


actgcgtgat 


gattacagca 


cctaatacac 


ttccacgagt 


catcaagaac 


ggacggcgag 


36060 


aaattggcaa


tatgctgtgt 


aggagagagt 


ttgacaatgg 


ctctctgccc 


tgtgtacacg 


36120 
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taaagfc fc fcgg 


cattgagcga


tcagtgcaga


acattgcgct


cggtgacgat


cccgctggcg


36180


gcatgtggtc 


atttgaggca


tcaatggcac


gtcagcaatt


cttgatgctc


caagacaagc 


36240


aatactctgg


tgtcgatcat


cgcgaagtcg


tcattgacga 


caggacatcg 


actccactca


36300


m 4^ y— t * y— *- 4— X- 4— 

aLcagLuCuc 


gaatatccac


gacctgatgc


aatggcgtgt 


atctcggcag


gccgaggaac


36360


C f- 4- t~% 4-* 4~ ^ 


cacugucgac 


ggucgaggaa 


aagagggcaa 


aggcgtcaat


tggaagaagt 


36420


4— 4— /~r ra ?5 a 

ttyaLcaaaa 


gguugcgggc 


yuagcaa uyt 


acctcaagaa


caaggtcaag 


gtccaggccg 




gcgaccatcu 


cctiuc cgaug 


tacacgcatt


cagaagaatt


u gr. t: u a tigc u 


gttcatgcat


O (Z C A C\ 


y CCLty LyCL 


i- ggagc u gr u. 


tgcataccaa


tggcgccaat 


tgatcagaac 


cggttgaatg 


iC c\ r\ 


agyatycgcc 


r-4 yi 4^ 4^ jt /"n 4" 

ggccLtgccg 


/— i *-v 4^ 4"" /~t r*\ +■ 4" 

caLauccLug 


cagatttcaa


ggtcaaagcc


attcttgtca


*3 tz a c n 


luacycuyacyt 


tgaccatctg


atgaagatca


agcaagtatc


gcagcacatc 


aaacaatcgg


36720


ccgctatcct


caagatcagt


gtgccaaaca


catacagcac


aacaaagccg


ccaaagcaat 


~> r~ <~7 o r\ 


ccagtggctg 


ccgcgacctc


aagcttacaa


ttcgaccggc


atggattcag 


gcgggtttcc 


o iz o a n 


cagtgctagt


ctggacatac


tggacgcccg


atcaacgtcg


tatcgcagtt


cagctgggcc 




aLayccaaat 


catggcactg


tgcaaggtcc


aaaaagaaac 


atgccaaatg 


acaagtacac 


36960


i3yaccaytccc 


^ggtuguguc 


cggagcacga 


taggacttgg


tttccttcac


acttgtctca


37020


ugggaaucux 


ccttgccgca


cccacatacc 


tggtgtcacc


tgttgacttt 


gcacaaaacc 


tl n n n n 




gn uccaaacg 


ctttcgcggt


acaagatcaa


ggatgcatat


gcaacgagtc 


37140


aaatgttgga


ccacgccatc


gcacgcggag 


ctggtaagag


tatggctctg 


cacgagctga 


37200


ayaaccucat 


gattgcgact


gatggaagac 


cacgcgttga


tgtttgtaag


tgaacatttg


"3 "7 O /C n 


<s utatyayayya 


/—\ 4— +- +- /-*t t\ 4" ^ 

cuuicaugau 


tgctaactca


atgcagacca 


aagagtgcgt 


gtgcactttg


*3 "~7 *3 o n 
O / JZU 


cgccagccaa 


cttagaccca


accgcaatca


acactgtcta


ctcacatgta


ttgaacccaa 


*3 T "3 o n 


uyy uaycaLC 


acgatcatac 


atgtgtattg


agccagtcga 


gctccatctc


jgatgtgcatg 


37440


CtCcyCyacy 


cggccccgcc 


atgcccgttg


accccyacac 


agagcccaac 


gctttgctcg


37500


tccaagacuc 


gggcatggtg 


ccagtgagca


cgcaaatatc


cattgtcaac 


ccagagacca 


37560


z oaccaacuyLy 


Cttyaacyyc 


gagtacggcg


agatctgggt


gcagtccgag 


gcgaatgctt


37620


atagcttcta 


catgtcgaaa 


gagcgcttgg 


atgcagaacg 


cttcaatggg 


aggacgattg 


37680


acggagaccc 


aaatgtgcga 


tatgttcgta


caggcgattt


agga u t t t t g 


cacagcgtga 


37740


cacggcccat


tggacccaac


ggtgcacctg


ttgatatgca


ggtgcttttc


gtgcttggaa 


37800


gcataggtga 


cacttttgaa


gtcaacggac 


tgaaccattt


ctctatggac 


attgagcagt 


37860


ctgttgaacg


ttgtcaccgg 


aatattgtcc 


ctggaggctg


gtacgtttct 


tcgattcgct


37920


gttatttagt 


aaatacttac 


taacactcta 


cagtgctgtt


ttccaggcag 




37980


ugt ugucg u u 


gtggaaatct


tccgacgcaa 


cttcctcgca


agcatggtgc 


ctgtgattgt


38040


caaugcaaux 


ttgaacgagc


atcagctggt


cattgacatt


gtctcgtttg


tgcaaaaggg 


38100


✓-i /•*■«- 4— 4-- /-i r~\ --3 

cyacuLCCaC 


cgguc ucguc 


tgggcgagaa 


gcaacgcgga


aagau.ee utg 


c aggaugggu 


38160


cacacggaag


atgcgcacaa 


tagcccagta 


cagtatacgg


gatcctaatg 


gacaggattc 


38220 


ccagatgatc 


acggaagagc 


ctggtccacg 


ggctagcatg 


actggaagta 


tgcttgggcg 


38280 


aatgggcggc 


ccagccagta 


tcaaggccgg 


gtcgacaaga 


gcaccgagtc 


taatgggcat 


38340 


gacagcgact 


atgaataatc 


tatcccttac 


acagcagcaa 


cagcagcaat 


accaacagcc 


38400 


gggtatgtat 


gctcaacagc 


aaggcatgca 


cccccagcaa 


caacaccaat 


ttagcatgtc 


38460 


caacacgcca


ccacaaggtc 


caccccaagg 


cgtagaacta 


catgatccta 


gcgaccgcac 


38520 
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accaacayac 


aaccggcacu 


4~ 4*" 4"* n /^i 4— 4*" 

ctuuccctgc 


cgacccgcgt 


4™ ft ft *m /~~r ra ft ft 

augcagaacc 


ay yy ccaaau 






ggcgcctacg 


aacccatgaa 


ctatcaaaac 


gcgtatcatc


f-% f—f ^ +- /«i <«i »tn /^i ^> 

cgcaucaaca 




acaatacgaa 


tctgaagacg 


99999 a 9 ca 9 


actcagcggc


cccgtgccag 


acgtgctgcg 


JO / UU 


gccgggtcct


tcatccgggt


ccatagagca 


gcacgaccaa 


gctaacaacg 




O C) / o U 


gtggaataat


cgcgagtact


atggtaacag 


cccaLugtac 


gcaggcggat


/^t ^ ft ft ft »3 /t 

acacycaaga 




tggcaatatc 


cacgagcagc 


aacaacacga 


4— /^*y ^ 4* *n m f^A f** 

tgagtacacg 


/-r 4— i >3 +■ ft ft fr^~ 

aguaaugegu 


/~» 4— 4^ ft ft ft ft ft 

cauauggegg 


J O O O \J 


aaatcaagga


gcaggcggag


gcagcggcgg


egguggegge 


cuccgaguug 


caaaucguga 


O O Q A O 


cagctccgac


a 9 c 9 a 999 u 9 


cagatgacga


cgcttggaga


cgugaugccc 


4— 4— nfrta r~r ?a 4~ 

tuyuccagau 


39000




gg c 99 c 9 CT --9 


4— r** f*% 4— /^r 4" 

cugcLgccuc 


cgc uggagca 


/~» /~i 4— ft ft +~ ft ft 4— /—r 

ccugcuycug 


r~r+~ ft ft 4- 4- 4~ 4~ r~* 




T O 4— 4" /"*« /—* ^ ft /*"* /■"< <-^r 

j_ ULLcgcayccg 


ggccatgcyc 


ay t. a c y y y 


dLdLyuy uyd 


yuuuuuuuuu 


aad. L U LUy U d 


"3 Q1 O A 


catagagacc 


gttgtatacg 


cagguuucaa 


attagaayag 


ft ft 4*" o 4™ ft **i 

egaacatyed 


+" o 4" a ft rt 4- /^>t+- 
Ld tudy LLy L 


•3Q1 pn 

O X O \J 


tgttcaatgt


tctagtttgg 


gaaggttaac


ccccccccct 


uccccuucca 


— i ^ /^i 4~ 4~ 4~ 4~ ^ 

agacuuuuca 




^-1 4— 4— 4— 4— 4- ~ 

cttgtttgcg 


tgtgatttaa 


atctggagat


4™* 4~" •— * -^i 4-" 4— 

tL.caaau.cua 


*~\ 4^ 4™ /—i 4~ 

caucucgcua 


4— — i ^ n f- ^ r~~r f* 4— /^r 

uacataggug 


o q o n n 


4- 4- 4- _ 


acguaggggg 


cagaagggta 


4— ^r-i 4— f~\ +- — f — 4— — 

LCLCyCydtd 


4~ 4" ^ /"y a /~* 4~ ft ft ft 

LL.agact.ggy 


a +/■ 4~ r~t ft a 4— na 

ay u uy LdLy d 


o ezn 

J «7 »J O \J 


xoaucaaggugu 


tgagcaaaaa


aagagagagc 


ggtgaagggc 


ft ft ft ft ft ft ft a 4— a 

9999999 ata 


r^r/^fl - /^t/^y 4— /— ^ 4~ ft ft 

99^99^9^9° 


Q A o n 


acguggcugg 


gcguauagcg 


aaaagagcca


aaggaatydt 


f""r r^i ^ a /-i ft ft ft a 

gaeaegggae 


ft ft f* a ft a rra a r~r 

yyududyddy 


O -7 *± O \J 




/T *3 4™ /—i /~r r\ o is 

gauauyaaac 


dtydc aacgc 


LU dy U Uy ay 


ca 999 t 9 i:: 99 


i*i 4— prra a a a f~fi~ 
L* LL-yaclaay U 


"3 QCJ4 n 

O O TC U 


gtaggtacat 


acggacgtat


gtagatatgt 


4— « 4— /—a f% /— 1 — ^ 4— -f— ji»f 

uaucccau. ug 


g-*\ gr^% ^ 4~~ 4~ /*rri 4*" /•"I 

ccauugcauc 


/~« 4™ 4- ft 4- ft *r\ 4— 


j"DUU 


gatactaacg


tgactggacg 


agataagtgt


4™ /i 4^ y4~ /*i y^i r?% 

cue ugucgea 


ft ft ft m 4" "^i 4*" /^i 4™ 

cgcaauaucu 


s /"^ ft ft a 4"nf" /-i *-\ 

acgcjauccca 




*s uugacaau eta 


ggcgccaagt 


taattgtgct


gcccccacgc 


4~ ft ft ft ft ft ft 

uucagccgcg 


^ ft ' — 1 ft ft ft ft ft 

aaccayyegg 


39720


acgggataag


gaagggagaa 


cccgtcttga


a u uaa tdcc u 


ttccttggac


uauaagaucu 


3QTfin 

J" / ou 


agaatgttcg 


gaagatagtc 


gcccaacggt 


cgacagcaag 


4f"*r ""^ 4^ i*~r 4~ 4™ rr\ j^y 

gauguaatag 


4— /— f 4-« ^\ 4- 4- "a /~r 

uag uauuaag 


J O t: U 


cgaacgaaag 


auuugcagag 


aaactcaaga 


aaaaggcgga 


gaaggaagaa 


!is ^ a a ;a ft 4— 4~ 
ddddddydL U 


3 qqnn 

O J? -7 u w 


gaaagcacau 


— i 4"* 4* s~*cs~i 4~ 4" 4"* q 

acguggutua 


ccaagacggg


augtatdLLg 


^ a /-i /-« a a +~ ft 


4— rr4~ aa+"r(f"l"n 

uy uaauy u uy 






ugugauguuc 


ataauuggct 


uggcaatcta 


CaCaCyaCay 


ft a 4- /va r*< a a ft ft 
LdLLjaUa cty u. 


*± U U 4 u 


tgaagggcaa 


aaaagcttga 


tgcgtgrugg 


tacaegguge 


^ ft » ^ 4— 4— ^ 

agagauugca 


^ ^ fr +— /->r4~ 4~ 4- 

aaagugu uuc 


1UUOU 


atgtatggaa 


ggttcctcca 




tgccaaggac


99 a 9 a "t c 99t 


ft ft *n /— * 4— — % — \ 

ggagcugaag 




4- 4- ~4. — ,~4- J- 4— 

ttagtggttt 


catgtggaaa 


acaggcccgg 


a 9 a 9t c 99t9 


_ j- 4- 4- 4- 4- 4- — 

ggcuuuuuua 


^/-y-4- 4- 4- 4- 4- 4- 4- ft 

gguuuuu uuc 




ctttgcaaaa


atcagttagt 


gaagggcgag


agcgggccag


atttcaggtc 


ggtcggttcg


a n q ^ n 


atgaccgact


gggacgccaa 


c 999tgtcac 


tgcggatacg 


tatcatttga


ucguggggac 


40320


ctgggagggc 


tgtgtggctt 


*—T*~T s-r 4-~ 4-^ 4— 4— « 4— 

ggagttttcc 


ft ft ft t* m ft ft 4™ /—i 

cggcaagctc 


^» 4-~ f~* r~t f~t i~t i^c/^r 

atdCyCCCyy 


4^ /^i ^ ft ft ft ^ 

u c_- <— cj ciy y u» ay 


a n *^ r n 

*t U J O u 


4— r^r ~i 4- 4-^ /— r4— /^i 

agugauuguc 


tgggtcgtgg 


uctcgcaucg 


ft ft ft ft ■n 4— /"i ^ /-T+- 

9999 aL - ca 9t 


gcaggcacag 


gcguatgLdu 


a n a a n 

U ^i: ^fc U 


ugi-gaatcga 


agccEcgtgL 


catgcgcgai. 


ftftT^ ffi— ^ «~f«~r/ - r+" 

99 a 9 u a 999 


/-* /T ^3 ft ft ft ^ 4- 

99 a 999 a 9 at: 


rTrTr^fT/~r4~ rra ft ft 

gy cgy ugacg 


40500


atgacgatag


tggtgatgtc


gcagctaaca 


gugaut.gu.aa 


ggcauuaaag 


ft ft ft ft /— i ^a +~ /~i /-f 

ccggcauy eg 


40560


gacaatgctt


acgtagtagt


gtacacaaga 


gcatgtgctg 


caataagctt 


tgtgctagat 


40620 


ttggagtgag 


gatgcccttg


gacatggaag 


cgtgtcgctg 


tcaatgtgtt 


acaccaaatg 


40680 


atagcgcatg


gcggaggaag 


cgccacactc 


taaaaccatg 


cattgaagaa 


ccgagacgtg 


40740 


agcgggtccc 


aaggtctggc 


ggaaatgaca 


taggtggaac 


cgcaatgtga 


aggattcgac 


40800 


gactattcca 


ttttttccag 


ttactgcgtc 


gaattttggc 


aaatgtcgac 


gagctgaagc 


40860 


atggcttgtg


gaggactacc 


agaagcgtta


tgccgcctgg 


cagggcacac 


actattgttt 


40920 
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f"T fa. /"* ra /"*T ^1 ft ft ft 4— 


ft ft ft ft ft ft /~i fa fa ft 

ggggcccaa_y 


ft f\ ft ft ft ft ft ft ft 

CaiiycyycyC 


atgaacaaac 


gtttcaacgt 


tatcgtgaag


40980


T— /^r *a. ft ft /"t /at *a 

tgagccgagt 


cgcggcagaa 


99 a 9 a 9cga.c 


actccggcgc 


tggtgttttg


aaactggctc




dCaCCgaagc 


aagcaccacg 


gtcaagcgat 


gcagttgagg 


caagcggcgc 


aagcgggacg 


41100


cgatggccgc 


atggatgcag 


ccagacgggc 


agaagagcgg 


acagtcagca


ggacatgttt




DLugccttgLu 


tcy ULLCyCL 


4™ ft ft ft ft 4— 4™ f*r*€ 


gugrgaggau 


»—i /^r i^i 4— m /"f ra 4^ 4— 

yayCLdyaLL 


/~«r 4-^ rTf** f \ 4-^ -a *t4— ra 

guggcragca 




4— ffff /T/t4— ^ 4"~ 


CLL.yca.yyLu 


4— --\ 4— 4 — ft ft ft 4~ <—r 

LaLLyCCLdy 


aagac u uggu 


aLLCCyayyL 


4™* fr ra ra ft ft ft *a 

ugaacgy cag 




2 S ft ft ^ O a 4— O /T 

aagcaaatay 


*t+~ ft ft ft ft ft ft ft 

y Ley yacccg 


■a /~r t" ft ft 4— ftf t fti— 

a 9 c 9 c ggg l 


Si /*T a ft ^ a ft ft ft ft 

agacaagggg 


-rr. « = 4— 4— 4— ft ft rt 

aCaaLLLCCC 


/~i /~« m «3 4~ 4— ft ft ft 

CCa.yaLL.yyc 


4t ± O *± u 


uaty t uy l. li_ 


yLcyaydayy 


ft ft rrs 4-4"«a ft ft 
ycyaLLCaCy 


a a s a rra ft ft ft ft 

aaaayayyyc 


ft 1 1 +~ ft ft ft ft ft fr 4~ 

yLLyycccyu 


ft ft d ft a ft f* ft ft ft 
cyayac uygg 


A 1 i fin 


f-i ft ft ft r<2rff" r*ft~ 


ft ft ft ft ft ft ft ~i ft 

yy cggcgaag 


fx ft ft ft ft fr 4— rtftft 

9999 c 9^ggg 


4— f~* ft ft ft ft ft ft Ti 

L-gagcggcca 


/"T i^T ft ft ft ft "^i /^r 

gagagggcag 


4— ftftfti- 4— fx \~~ ft ■a 

^999 U U 9 t 9 a 


a i A^n 

4t D U 


"1 firaarrr'apaarT 
J_ U LaaLj LclLclci.y 


CI S3 /T +- ft ft ft ft ft 

cacy ugcggg 


4— ft ft ft ft \~ ft ft ft ft 

c 9999 c 9999 


ft ft ft -3 in 4~ 4~ /-t4~ rr 

gggaauuyty 


r~r /"i ft ft ft ft ft ft m 

geaggggega 


ft ft /-T+- ft ft ft ft ft 4— 

ccyLyccycL 


A1 CTn 


*t j*** 

gcgaaCCady 


f% ft ft ft f% f-% if— \ +^ 

cgcgcgca ll 


/aT/ar4™* 4™ 1 fa /at 4~ /^r 4-» 

ggrugacugu 


ggraugcacg 


ra ra *a ^**r /*< ^*t+ -1 ^ 4-* 

aagagcgt.au 


acac uagcag 


Ai con 

4J.OOU 


1^ f-\ f-*r ^* « 4— 

Caayayaatg 


fa /"a* ft /ar i^/Y/^ fa 

cay cgc cgca 


gggtagtaag 


catagggcgg 


cgacggcgcc 


tcgtggcaag


A 1 CA f\ 


ta-999 ac 9 a 9 


ggctgttgag 


tggtgcaggt 


atgctggtat 


gctagtagta 


gctcctacgt 




3 ft ft ft <t4— ft ft ft ft 

aggcgLggcc 


<t4-"/-t4— ?a /Tr-r4— ft ft 

gtguaggugc 


<*"f 4- ft ft ft ft r~t t*i ti 

g ugy cgcaay 


/— 1 4— ft ft 4— ft ft ftft¥- 

CLyCLyycyL 


ftfti— 4- ft ft 4— ft ft ft 

gg u u g c u gg c 


ft 4— ft ft +- ft ft ft ft ft 

cugc uggccg 


41760


ggttgctggc


ctggctgccg 


tcttgcacaa 


ggcaaatgca 


tagagtcgtg


cccagcgccg 


41820 


gctttcggcc 


ttggtagtgc 


actgggcgtg 


tgaatagctg 


tcagcacgcc 


cgctggcggt 


41880 


tcgcgccatg 


gtggagattt 


tgcacgcgac 


atggacgacg 


acggcctcgg 


cagcgtgagg


41940 


aacatgtcaa 


aatgaaccca 


ggggtgcatc


aaagccgttt 


tacctgaaca 


gatgagtgcg 


42000 


atctctgccg 


ggatgcgtga 


tgaagttgac 


tcgcttggac 


gacggtttgg 


gggcaggcta 


42060 


gagccgcaca


tgtcatcggc 


cgggcatggc 


gtcggggcct 


gcacagttcc 


tgcag 


42115 



<210> 60 

<211> 9

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Cyclization domain motif



<221> SITE 
<222> 6 

<223> Xaa = Asp or Glu 

<221> SITE 
<222> 9 

<223> Xaa = Ser or Ala 



<221> SITE
<222> 2-5, 7-8

<223> Xaa = any amino acid 

<400> 60 

Asp Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 

<210> 61 
<211> 15 
<212> PRT

<213> Cochliobolus heterostrophus 

<400> 61 

Cys Phe Ile Ala Gly Val Val Ala Val Pro Ile Asn Ser Val Asp
1 5 10 15

<210> 62 
<211> 15 
<212> PRT 
<213> Cochliobolus heterostrophus

<400> 62 

Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala Pro Ile Asp
1 5 10 15


<210> 63 
<211> 15 
<212> PRT 

<213> Myxococcus xanthus 


<400> 63 

Cys Leu Tyr Ala Gly Val Val Ala Val Pro Val Tyr Pro Pro Asp 
1 5 10 15

<210> 64
<211> 14 
<212> PRT 

<213> Bacillus brevis 



<400> 64
Val Leu Lys Ala Gly Gly Tyr Val Pro Ile Asp Ile Glu Tyr
1 5 10

<210> 65 
<211> 15
<212> PRT 

<213> Cochliobolus carbonum 
<400> 65 

Ile Leu Lys Ala Gly Gly Val Cys Val Pro Ile Asp Pro Arg Tyr
1 5 10 15

<210> 66 
<211> 15 
<212> PRT

<213> Cochliobolus carbonum 

<400> 66 

Val Val Gln Ala Gly Gly Val Phe Val Leu Leu Glu Pro Gly His
1 5 10 15

<210> 67 
<211> 15 
<212> PRT 
<213> Fusarium scirpi

<400> 67 

Val Leu Lys Ala Gly His Ala Phe Thr Leu Ile Asp Pro Ser Asp
1 5 10 15


<210> 68 
<211> 15 
<212> PRT 

<213> Fusarium scirpi 


<400> 68 

Ile Leu Lys Ala Asn Leu Ala Tyr Leu Pro Leu Asp Val Arg Ser
1 5 10 15



<210> 69
<211> 15

<212> PRT 

<213> Aspergillus nidulans



<400> 69

Val Trp Lys Ser Gly Ala Ala Tyr Val Pro Ile Asp Pro Thr Tyr
1 5 10 15



<210> 70 
<211> 15
<212> PRT 

<213> Aspergillus nidulans 



<400> 70 

Val Trp Lys Ser Gly Gly Ala Tyr Val Pro Ile Asp Pro Gly Tyr
1 5 10 15



<210> 71 
<211> 15 
<212> PRT

<213> Tolypocladium niveum



<400> 71 

Ile Leu Lys Ala His Leu Ala Tyr Leu Pro Leu Asp Ile Asn Val
1 5 10 15



<210> 72 
<211> 15 
<212> PRT 
<213> Tolypocladium niveum



<400> 72 

Ile Leu Lys Ala Gly His Ala Tyr Leu Pro Leu Asp Val Asn Val
1 5 10 15

<210> 73 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Consensus sequence

<221> SITE 
<222> 1, 6-8, 14-15
<223> Xaa = any amino acid 

<400> 73 

Xaa Leu Lys Ala Gly Xaa Xaa Xaa Val Pro Ile Asp Pro Xaa Xaa
1 5 10 15

<210> 74 
<211> 19 
<212> PRT 
<213> Artificial Sequence

<220> 

<223> Consensus sequence 

<221> SITE

<222> 5, 8, 13-15, 18 

<223> Xaa = any amino acid

<400> 74 

Phe Thr Ser Gly Xaa Thr Gly Xaa Pro Lys Gly Val Xaa Xaa Xaa His
1 5 10 15

Arg Xaa Ile



<210> 75
<211> 19
<212> PRT 

<213> Cochliobolus heterostrophus 
<400> 75

Phe Ser Arg Ala Pro Thr Gly Asp Leu Arg Gly Val Val Leu Ser His

1 5 10 15

Arg Thr Ile



<210> 76
<211> 18
<212> PRT 

<213> Cochliobolus heterostrophus 


<400> 76 

Trp Thr Tyr Trp Thr Pro Asp Gln Arg Ala Val Gln Leu Gly His Ser

1 5 10 15

Gln Ile


<210> 77 
<211> 19 
<212> PRT 
<213> Myxococcus xanthus

<400> 77 

Tyr Thr Ser Gly Ser Thr Ala Asp Pro Lys Gly Val Val Leu Thr His 
1 5 10 15

Arg Asn Leu

<210> 78 
<211> 19 
<212> PRT

<213> Bacillus brevis 

<400> 78 

Tyr Thr Ser Gly Thr Thr Gly Asn Pro Lys Gly Thr Met Leu Glu His 
1 5 10 15

Lys Gly Ile

<210> 79 
<211> 19
<212> PRT 

<213> Cochliobolus carbonum 
<400> 79 

Phe Thr Ser Gly Ser Thr Gly Val Pro Lys Cys Ile Val Val Thr His
1 5 10 15

Ser Gln Ile



<210> 80

<211> 18 

<212> PRT 

<213> Cochliobolus carbonum 



<400> 80

Phe Thr Ser Gly Thr Gly Val Pro Lys Gly Ala Val Ala Thr His Gln

1 5 10 15

Ala Tyr 

15 

<210> 81 
<211> 19 
<212> PRT 

<213> Fusarium scirpi 

<400> 81
Phe Thr Ser Gly Ser Thr Gly Ile Pro Lys Gly Ile Met Ile Glu His
1 5 10 15

Arg Ser Phe



<210> 82 
<211> 19 
<212> PRT 
<213> Fusarium scirpi

<400> 82
Phe Thr Ser Gly Ser Thr Gly Lys Pro Lys Gly Val Met Ile Glu His
1 5 10 15

Arg Ala Ile



<210> 83 
<211> 19 
<212> PRT
<213> Aspergillus nidulans

<400> 83
Tyr Thr Ser Gly Thr Thr Gly Phe Pro Lys Gly Ile Phe Lys Gln His
1 5 10 15

Thr Asn Val



<210> 84 
<211> 19
<212> PRT
<213> Aspergillus nidulans

<400> 84
Tyr Thr Ser Gly Thr Thr Gly Arg Pro Lys Gly Val Thr Val Glu His
1 5 10 15

His Gly Val



<210> 85
<211> 19
<212> PRT
<213> Tolypocladium niveum

<400> 85
Phe Thr Ser Gly Ser Thr Gly Lys Pro Lys Gly Val Met Ile Glu His
1 5 10 15

Arg Gly Ile

<210> 86 
<211> 19 
<212> PRT 

<213> Tolypocladium niveum

<400> 86
Phe Thr Ser Gly Ser Thr Gly Lys Pro Lys Gly Val Met Ile Glu His
1 5 10 15

Arg Gly Val

<210> 87
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> Consensus sequence

<221> SITE
<222> 4, 6, 8
<223> Xaa = any amino acid

<400> 87
Gly Glu Leu Xaa Val Xaa Gly Xaa Gly Leu Ala Arg Gly Tyr
1 5 10



<210> 88 
<211> 14 
<212> PRT 
<213> Cochliobolus heterostrophus

<400> 88
Gly Glu Ile Trp Val Asp Ser Pro Ser Leu Ser Gly Gly Phe
1 5 10

<210> 89 
<211> 14 
<212> PRT 

<213> Cochliobolus heterostrophus 
<400> 89 

Gly Glu Ile Trp Val Gln Ser Glu Ala Asn Ala Tyr Ser Phe
1 5 10



<210> 90
<211> 14
<212> PRT
<213> Myxococcus xanthus

<400> 90
Gly Glu Ile Trp Val Arg Gly Pro Ser Val Ala Gln Gly Tyr
1 5 10

<210> 91 
<211> 14
<212> PRT
<213> Bacillus brevis

<400> 91
Gly Glu Leu Cys Ile Gly Gly Glu Gly Leu Ala Arg Gly Tyr
1 5 10

<210> 92 
<211> 14 
<212> PRT
<213> Cochliobolus carbonum

<400> 92
Gly Glu Leu Leu Ile Glu Ser Gly His Leu Ala Asp Lys Tyr
1 5 10

<210> 93 
<211> 14 
<212> PRT 
<213> Cochliobolus carbonum

<400> 93
Gly Glu Leu Ile Ile Glu Gly Ser Ile Leu Cys Arg Gly Tyr
1 5 10

<210> 94 
<211> 14 
<212> PRT 

<213> Fusarium scirpi 

<400> 94
Gly Glu Leu Val Ile Glu Ser Ala Gly Ile Ala Arg Asp Tyr
1 5 10



<210> 95
<211> 14
<212> PRT
<213> Fusarium scirpi

<400> 95
Gly Glu Leu Val Val Thr Gly Asp Gly Val Gly Arg Gly Tyr
1 5 10



<210> 96 
<211> 14
<212> PRT
<213> Aspergillus nidulans

<400> 96
Gly Glu Leu His Ile Gly Gly Leu Gly Ile Ser Lys Gly Tyr
1 5 10



<210> 97 
<211> 14 
<212> PRT
<213> Aspergillus nidulans

<400> 97
Gly Glu Leu Tyr Leu Gly Gly Glu Gly Val Val Arg Gly Tyr
1 5 10



<210> 98 
<211> 14 
<212> PRT 
<213> Tolypocladium niveum

<400> 98
Gly Glu Leu Val Val Ser Gly Asp Gly Leu Ala Arg Gly Tyr
1 5 10

<210> 99 
<211> 14 
<212> PRT 

<213> Tolypocladium niveum

<400> 99
Gly Glu Leu Val Val Thr Gly Asp Gly Leu Ala Arg Gly Tyr
1 5 10



<210> 100
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Consensus sequence

<221> SITE
<222> 7
<223> Xaa = any amino acid

<400> 100
Tyr Arg Thr Gly Asp Leu Xaa Arg
1 5

<210> 101 
<211> 9 
<212> PRT 

<213> Cochliobolus heterostrophus 


<400> 101 

Phe Leu Arg Thr Gly Leu Leu Gly Phe 
1 5 



<210> 102
<211> 9
<212> PRT
<213> Cochliobolus heterostrophus

<400> 102

Tyr Val Arg Thr Gly Asp Leu Gly Phe 
1 5 



<210> 103 
<211> 9
<212> PRT
<213> Myxococcus xanthus

<400> 103
Trp Leu Arg Thr Gly Asp Leu Gly Phe
1 5

<210> 104 
<211> 8 
<212> PRT
<213> Bacillus brevis

<400> 104
Tyr Lys Thr Gly Asp Gln Ala Arg
1 5

<210> 105 
<211> 8 
<212> PRT 
<213> Cochliobolus carbonum

<400> 105
Tyr Arg Thr Gly Asp Leu Val Arg
1 5

<210> 106 
<211> 8 
<212> PRT 

<213> Cochliobolus carbonum 


<400> 106 

Tyr Lys Thr Gly Asp Leu Val Arg 
1 5 

<210> 107
<211> 8
<212> PRT
<213> Fusarium scirpi

<400> 107



WO 02/42444 



Tyr Arg Thr Gly Asp Leu Ala Cys 
1 5 

<210> 108 
<211> 8
<212> PRT
<213> Fusarium scirpi

<400> 108
Tyr Arg Thr Gly Asp Arg Met Arg
1 5

<210> 109 
<211> 8 
<212> PRT
<213> Aspergillus nidulans

<400> 109
Tyr Lys Thr Gly Asp Leu Ala Arg
1 5

<210> 110 
<211> 8 
<212> PRT 
<213> Aspergillus nidulans

<400> 110
Tyr Lys Thr Gly Asp Leu Val Arg
1 5

<210> 111 
<211> 8 
<212> PRT 

<213> Tolypocladium niveum

<400> 111

Tyr Arg Thr Gly Asp Arg Ala Arg 
1 5 



<210> 112
<211> 8
<212> PRT
<213> Tolypocladium niveum

<400> 112
Tyr Arg Thr Gly Asp Arg Ala Arg
1 5

<210> 113 
<211> 21
<212> PRT
<213> Artificial Sequence

<220>
<223> Consensus sequence

<221> SITE
<222> 4, 6, 13
<223> Xaa = any amino acid

<400> 113
Leu Gly Arg Xaa Asp Xaa Gln Val Lys Ile Arg Gly Xaa Arg Ile Glu
1 5 10 15

Leu Gly Glu Val Glu
20

<210> 114 
<211> 18 
<212> PRT 
<213> Cochliobolus heterostrophus

<400> 114
Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln Arg Val Glu Asn Gly Gln
1 5 10 15

Leu Glu



<210> 115 
<211> 21 
<212> PRT
<213> Bacillus brevis

<400> 115
Leu Gly Arg Ile Asp Asn Gln Val Lys Ile Arg Gly His Arg Val Glu
1 5 10 15

Leu Glu Glu Val Glu
20

<210> 116 
<211> 21
<212> PRT
<213> Cochliobolus carbonum

<400> 116
Leu Gly Arg Lys Asp Thr Gln Val Lys Met Asn Gly Gln Arg Phe Glu
1 5 10 15

Leu Gly Glu Val Glu
20

<210> 117
<211> 21
<212> PRT
<213> Cochliobolus carbonum

<400> 117
Val Gly Arg Ser Asp Thr Gln Ile Lys Leu Ala Gly Gln Arg Val Glu
1 5 10 15

Leu Gly Asp Val Glu
20

<210> 118 
<211> 21 
<212> PRT 

<213> Fusarium scirpi 

<400> 118
Leu Gly Arg Met Asp Ser Gln Val Lys Ile Arg Gly Gln Arg Val Glu
1 5 10 15

Leu Gly Ala Val Glu
20

<210> 119
<211> 21
<212> PRT
<213> Fusarium scirpi

<400> 119
Phe Gly Arg Met Asp Asn Gln Phe Lys Ile Arg Gly Asn Arg Ile Glu
1 5 10 15

Ala Gly Glu Val Glu
20

<210> 120 
<211> 21 
<212> PRT 
<213> Aspergillus nidulans

<400> 120
Leu Gly Arg Ala Asp Phe Gln Ile Lys Leu Arg Gly Ile Arg Ile Glu
1 5 10 15

Pro Gly Glu Ile Glu
20

<210> 121 
<211> 21 
<212> PRT
<213> Aspergillus nidulans

<400> 121
Leu Gly Arg Asn Asp Phe Gln Val Lys Ile Arg Gly Leu Arg Ile Glu
1 5 10 15

Leu Gly Glu Ile Glu
20

<210> 122 
<211> 21
<212> PRT
<213> Tolypocladium niveum

<400> 122
Phe Gly Arg Met Asp Gln Gln Val Lys Ile Arg Gly His Arg Ile Glu
1 5 10 15

Pro Ala Glu Val Glu
20



<210> 123
<211> 21
<212> PRT
<213> Tolypocladium niveum

<400> 123
Phe Gly Arg Met Asp His Gln Val Lys Val Arg Gly His Arg Ile Glu
1 5 10 15

Leu Ala Glu Val Glu
20

<210> 124 
<211> 21 
<212> PRT 

<213> Cochliobolus heterostrophus 

<400> 124
Leu Gly Ser Ile Gly Asp Thr Phe Glu Val Asn Gly Leu Asn His Phe
1 5 10 15

Ser Met Asp Ile Glu
20



<210> 125 
<211> 21 
<212> PRT 
<213> Myxococcus xanthus

<400> 125
Ser Gly Arg Arg Lys Asp Leu Leu Val Ile Arg Gly Arg Asn Tyr Tyr
1 5 10 15

Pro Gln Asp Leu Glu
20



<210> 126
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Consensus sequence

<221> SITE
<222> 3-4, 10, 12-13
<223> Xaa = any amino acid

<400> 126
Phe Phe Xaa Xaa Gly Gly Asp Ser Leu Xaa Ala Xaa Xaa
1 5 10



<210> 127 
<211> 13
<212> PRT
<213> Cochliobolus heterostrophus

<400> 127
Leu Asp Ile Pro Phe Leu Asp Ser Leu Ser Glu Arg Cys
1 5 10



<210> 128 
<211> 13 
<212> PRT
<213> Cochliobolus heterostrophus

<400> 128
Arg Asp Pro Asn Gly Gln Asp Ser Gln Met Ile Thr Glu
1 5 10



<210> 129 
<211> 13 
<212> PRT 
<213> Myxococcus xanthus

<400> 129
Leu Pro Asp Leu Gly Leu Asp Ser Leu Ala Leu Val Glu
1 5 10

<210> 130
<211> 13
<212> PRT
<213> Bacillus brevis

<400> 130
Phe Tyr Ala Leu Gly Gly Asp Ser Ile Lys Ala Ile Gln
1 5 10



<210> 131
<211> 13
<212> PRT
<213> Cochliobolus carbonum

<400> 131
Phe Ile His Ala Gly Gly Asp Ser Ile Thr Ala Met Gln
1 5 10



<210> 132 
<211> 13
<212> PRT
<213> Cochliobolus carbonum

<400> 132
Phe Phe Ser Ser Gly Gly Asn Ser Met Ala Ala Ile Ala
1 5 10



<210> 133 
<211> 13 
<212> PRT
<213> Fusarium scirpi

<400> 133
Phe Phe Glu Met Gly Gly Asn Ser Ile Ile Ala Ile Lys
1 5 10



<210> 134 
<211> 13 
<212> PRT 
<213> Fusarium scirpi

<400> 134
Phe Phe Gln Leu Gly Gly His Ser Leu Leu Ala Thr Lys
1 5 10



<210> 135
<211> 13
<212> PRT
<213> Aspergillus nidulans

<400> 135
Phe Phe Arg Leu Gly Gly His Ser Ile Thr Cys Ile Gln
1 5 10



<210> 136 
<211> 13
<212> PRT
<213> Aspergillus nidulans

<400> 136
Phe Phe Ser Leu Gly Gly Asp Ser Leu Lys Ser Thr Lys
1 5 10



<210> 137 
<211> 13 
<212> PRT
<213> Tolypocladium niveum

<400> 137
Phe Phe Asp Leu Gly Gly His Ser Leu Thr Ala Met Lys
1 5 10



<210> 138 
<211> 13 
<212> PRT 
<213> Tolypocladium niveum

<400> 138
Phe Phe Asn Val Gly Gly His Ser Leu Leu Ala Thr Lys
1 5 10

<210> 139
<211> 16
<212> PRT
<213> Artificial Sequence

<220>
<223> Consensus sequence

<221> SITE
<222> 1-4, 6, 8, 10-12, 16
<223> Xaa = any amino acid

<400> 139
Xaa Xaa Xaa Xaa Gly Xaa Ser Xaa Gly Xaa Xaa Xaa Ala Phe Glu Xaa
1 5 10 15



<210> 140 
<211> 16 
<212> PRT 
<213> Cochliobolus heterostrophus

<400> 140
Val Leu Arg Pro Gly Pro Ser Ser Gly Ser Glu Gln His Asp Gln Ala
1 5 10 15

<210> 141 
<211> 16 
<212> PRT 

<213> Aspergillus nidulans 

<400> 141
Tyr His Phe Ile Gly Trp Ser Phe Gly Gly Thr Ile Ala Met Glu Ile
1 5 10 15



<210> 142
<211> 16
<212> PRT
<213> Bacillus brevis

<400> 142
Tyr Val Leu Ile Gly Tyr Ser Ser Gly Gly Asn Leu Ala Phe Glu Val
1 5 10 15

<210> 143 
<211> 16
<212> PRT
<213> Bacillus brevis

<400> 143
Phe Ala Phe Leu Gly His Ser Met Gly Ala Leu Ile Ser Phe Glu Leu
1 5 10 15

<210> 144 
<211> 16 
<212> PRT
<213> Myxococcus xanthus

<400> 144
Leu Thr Leu Phe Gly Tyr Ser Ala Gly Cys Ser Leu Ala Phe Glu Ala
1 5 10 15

<210> 145 
<211> 16 
<212> PRT 
<213> Brevibacillus brevis

<400> 145
Tyr Thr Leu Met Gly Tyr Ser Ser Gly Gly Asn Leu Ala Phe Glu Val
1 5 10 15

<210> 146 
<211> 16 
<212> PRT 

<213> Brevibacillus brevis 

<400> 146
Phe Ala Phe Phe Gly His Ser Met Gly Gly Leu Val Ala Phe Glu Leu
1 5 10 15



<210> 147
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Consensus sequence

<221> SITE
<222> 2, 4
<223> Xaa = any amino acid

<400> 147
Gly Xaa Ser Xaa Gly
1 5

<210> 148 
<211> 19 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Primer

<400> 148
gctagcatgg ccctcacac 19

<210> 149 
<211> 18 
<212> DNA 
<213> Artificial Sequence

<220>
<223> Primer

<400> 149

acgatcaggg ttggagaa 18 



<210> 150 
<211> 17 
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 150
agcaaagcgc attcctc 17

<210> 151
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 151
gtctctatct agctacggca ttgt 24

<210> 152
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 152
gacggtccgc tagtatccat 20

<210> 153
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 153
acgtctcaag tcaatgccca ata 23

<210> 154
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 154
caaactcgga tctcttctct acag 24

<210> 155
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 155
cacgatggcg gcttacag 18

<210> 156
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 156
acacaaggcc gatgaagc 18

<210> 157
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer



PCT/US01/43381 




<400> 157 

cgtcgacgta tccatcctt 19 

<210> 158 
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 158 

ttccagcgcg taagtaagtc a 21 

<210> 159
<211> 23
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 159
actcggaccg gacggaataa caa 23

<210> 160 
<211> 19 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Primer

<400> 160
gccatgtgca gtgaagagg 19

<210> 161 
<211> 17 
<212> DNA 
<213> Artificial Sequence

<220>
<223> Primer

<400> 161
catggcgact ccctgtt 17

<210> 162
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 162
gtcatgtcta cccttcctct ca 22

<210> 163
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 163
acatatagtt tggacgctct gc 22

<210> 164
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 164
tatcgcccta cctacaacgc acta 24



<210> 165
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 165
ggacgagggc tttagtggt 19

<210> 166
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 166
gcctagcaaa tggagtaaag 20

<210> 167
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 167
ggcctacccg cttctatca 19

<210> 168
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 168
agccctctgc agacgaatcc 20

<210> 169 
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 169 

atcaggcgag aaggtgttg 19 

<210> 170
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 170
tggcgttgct tcctacag 18

<210> 171 
<211> 19 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Primer

<400> 171
tgggcagcaa atggcacag 19

<210> 172 
<211> 18 
<212> DNA 
<213> Artificial Sequence

<220>
<223> Primer

<400> 172
gcgctgcaca acccatca 18

<210> 173
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 173
ctgccaagga atttcatcaa gt 22

<210> 174
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 174
tgtgttgacc tccactagct c 21

<210> 175
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 175
cgctgacgtt tgaccatctg a 21



<210> 176
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 176
ctctcaaccc acaacctaac c 21

<210> 177
<211> 22
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 177
ttcttcaaag tactcgtgtt cc 22

<210> 178
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 178
gttgcgtagt ggccgatgaa t 21

<210> 179
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 179
atgcttcgcg ggttatgg 18

<210> 180 
<211> 18
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 180 

gcgaagatgt gcgtgttg 18 

<210> 181
<211> 21
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 181
agacccagct gttgcccatt g 21

<210> 182 
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Primer

<400> 182
tttgggtccg aagtagagat t 21

<210> 183 
<211> 19 
<212> DNA 
<213> Artificial Sequence

<220>
<223> Primer

<400> 183
ggcaagaatc gaccctacc 19

<210> 184 
<211> 711 
<212> PRT 
<213> Saccharomyces cerevisiae

<400> 184
Met Tyr Trp Val Leu Leu Cys Gly Ser Ile Leu Leu Cys Cys Leu Ser
1 5 10 15

Gly Ala Ser Ala Ser Pro Ala Lys Thr Lys Met Tyr Gly Lys Leu Pro
20 25 30

Leu Val Leu Thr Asp Ala Cys Met Gly Val Leu Gly Glu Val Thr Trp
35 40 45

Glu Tyr Ser Ser Asp Asp Leu Tyr Ser Ser Pro Ala Cys Thr Tyr Glu
50 55 60

Pro Ala Leu Gln Ser Met Leu Tyr Cys Ile Tyr Glu Ser Leu Asn Glu
65 70 75 80

Lys Gly Tyr Ser Asn Arg Thr Phe Glu Lys Thr Phe Ala Ala Ile Lys
85 90 95

Glu Asp Cys Ala Tyr Tyr Thr Asp Asn Leu Gln Asn Met Thr Asn Ala
100 105 110

Asp Phe Tyr Asn Met Leu Asn Asn Gly Thr Thr Tyr Ile Ile Gln Tyr
115 120 125

Ser Glu Gly Ser Ala Asn Leu Thr Tyr Pro Ile Glu Met Asp Ala Gln
130 135 140

Val Arg Glu Asn Tyr Tyr Tyr Ser Tyr His Gly Phe Tyr Ala Asn Tyr
145 150 155 160

Asp Ile Gly His Thr Tyr Gly Gly Ile Ile Cys Ala Tyr Phe Val Gly
165 170 175

Val Met Ile Leu Ala Ser Ile Leu His Tyr Leu Ser Tyr Thr Pro Phe
180 185 190

Lys Thr Ala Leu Phe Lys Gln Arg Leu Val Arg Tyr Val Arg Arg Tyr
195 200 205

Leu Thr Ile Pro Thr Ile Trp Gly Lys His Ala Ser Ser Phe Ser Tyr
210 215 220

Leu Lys Ile Phe Thr Gly Phe Leu Pro Thr Arg Ser Glu Gly Val Ile
225 230 235 240

Ile Leu Gly Tyr Leu Val Leu His Thr Val Phe Leu Ala Tyr Gly Tyr
245 250 255

Gln Tyr Asp Pro Tyr Asn Leu Ile Phe Asp Ser Arg Arg Glu Gln Ile
260 265 270

Ala Arg Tyr Val Ala Asp Arg Ser Gly Val Leu Ala Phe Ala His Phe
275 280 285

Pro Leu Ile Ala Leu Phe Ala Gly Arg Asn Asn Phe Leu Glu Phe Ile
290 295 300

Ser Gly Val Lys Tyr Thr Ser Phe Ile Met Phe His Lys Trp Leu Gly
305 310 315 320

Arg Met Met Phe Leu Asp Ala Val Ile His Gly Ala Ala Tyr Thr Ser
325 330 335

Tyr Ser Val Phe Tyr Lys Asp Trp Ala Ala Ser Lys Glu Glu Thr Tyr
340 345 350

Trp Gln Phe Gly Val Ala Ala Leu Cys Ile Val Gly Val Met Val Phe
355 360 365

Phe Ser Leu Ala Met Phe Arg Lys Phe Phe Tyr Glu Ala Phe Leu Phe
370 375 380

Leu His Ile Val Leu Gly Ala Leu Phe Phe Tyr Thr Cys Trp Glu His
385 390 395 400

Val Val Glu Leu Ser Gly Ile Glu Trp Ile Tyr Ala Ala Ile Ala Ile
405 410 415

Trp Thr Ile Asp Arg Leu Ile Arg Ile Val Arg Val Ser Tyr Phe Gly
420 425 430

Phe Pro Lys Ala Ser Leu Gln Leu Val Gly Asp Asp Ile Ile Arg Val
435 440 445

Thr Val Lys Arg Pro Val Arg Leu Trp Lys Ala Lys Pro Gly Gln Tyr
450 455 460

Val Phe Val Ser Phe Leu His His Leu Tyr Phe Trp Gln Ser His Pro
465 470 475 480

Phe Thr Val Leu Asp Ser Ile Ile Lys Asp Gly Glu Leu Thr Ile Ile
485 490 495

Leu Lys Glu Lys Lys Gly Val Thr Lys Leu Val Lys Lys Tyr Val Cys
500 505 510

Cys Asn Gly Gly Lys Ala Ser Met Arg Leu Ala Ile Glu Gly Pro Tyr
515 520 525

Gly Ser Ser Ser Pro Val Asn Asn Tyr Asp Asn Val Leu Leu Leu Thr
530 535 540

Gly Gly Thr Gly Leu Pro Gly Pro Ile Ala His Ala Ile Lys Leu Gly
545 550 555 560

Lys Thr Ser Ala Ala Thr Gly Lys Gln Phe Ile Lys Leu Val Ile Ala
565 570 575

Val Arg Gly Phe Asn Val Leu Glu Ala Tyr Lys Pro Glu Leu Met Cys
580 585 590

Leu Glu Asp Leu Asn Val Gln Leu His Ile Tyr Asn Thr Met Glu Val
595 600 605

Pro Ala Leu Thr Pro Asn Asp Ser Leu Glu Ile Ser Gln Gln Asp Glu
610 615 620

Lys Ala Asp Gly Lys Gly Val Val Met Ala Thr Thr Leu Glu Gln Ser
625 630 635 640

Pro Asn Pro Val Glu Phe Asp Gly Thr Val Phe His His Gly Arg Pro
645 650 655

Asn Val Glu Lys Leu Leu His Glu Val Gly Asp Leu Asn Gly Ser Leu
660 665 670

Ala Val Val Cys Cys Gly Pro Pro Val Phe Val Asp Glu Val Arg Asp
675 680 685

Gln Thr Ala Asn Leu Val Leu Glu Lys Pro Ala Lys Ala Ile Glu Tyr
690 695 700

Phe Glu Glu Tyr Gln Ser Trp
705 710



<210> 185 
<211> 1774
<212> PRT
<213> Cochliobolus heterostrophus

<400> 185
Met Met Gly Asn Tyr Ala Phe Asn Pro Asp Asn Gln Gln Ser Tyr Asp
1 5 10 15

Gly Gln Phe Gly Ser Pro Gly Glu Ala Ser Arg Arg Ser Thr Met Leu
20 25 30

Glu Val Asn Gln Gly Tyr Phe Ser Asp Phe Thr Gly Gln Gln Met Gln
35 40 45

Asp Asn Arg Asp Ser Tyr Gly Gly Pro Asn Arg Tyr Ser Ser Gly Asp
50 55 60

Ala Phe Ser Pro Thr Ala Ala Ile Pro Pro Pro Met Met Asn Pro Asn
65 70 75 80

Asp Leu Pro Leu Gly Ala Ala Glu Thr Met Met Pro Leu Glu Pro Arg
85 90 95

Asp Leu Pro Phe Asp Val Tyr Asp Pro His Asn Pro Asn Val Lys Met
100 105 110

Ser Lys Phe Asp Asn Ile Gly Ala Val Leu Arg His Arg Ser Arg Thr
115 120 125

Gln Pro Arg Thr Thr Ala Phe Trp Val Leu Asp Ala Lys Gly Lys Glu
130 135 140

Thr Ala Ser Ile Thr Trp Glu Lys Val Ala Ser Arg Ala Glu Lys Val
145 150 155 160

Ala Lys Val Ile Arg Asp Lys Ser Asn Leu Tyr Arg Gly Asp Arg Val
165 170 175

Ala Leu Val Tyr Arg Asp Thr Glu Ile Ile Asp Phe Val Val Ala Leu
180 185 190

Met Gly Cys Phe Ile Ala Gly Val Val Ala Val Pro Ile Asn Ser Val
195 200 205

Asp Asp Tyr Gln Lys Leu Ile Leu Leu Leu Thr Thr Thr Gln Ala His
210 215 220

Leu Ala Leu Thr Thr Asp Asn Asn Leu Lys Ala Phe His Arg Asp Ile
225 230 235 240

Ser Gln Asn Arg Leu Lys Trp Pro Ser Gly Val Glu Trp Trp Lys Thr
245 250 255

Asn Glu Phe Gly Ser His His Pro Lys Lys His Asp Asp Thr Pro Ala
260 265 270

Leu Gln Val Pro Glu Val Ala Tyr Ile Glu Phe Ser Arg Ala Pro Thr
275 280 285

Gly Asp Leu Arg Gly Val Val Leu Ser His Arg Thr Ile Met His Gln
290 295 300

Met Ala Cys Ile Ser Ala Met Ile Ser Thr Ile Pro Thr Asn Ala Gln
305 310 315 320

Ser Gln Asp Thr Phe Ser Thr Ser Leu Arg Asp Ala Glu Gly Lys Phe
325 330 335

Val Ala Pro Ala Pro Ser Arg Asn Pro Thr Glu Val Ile Leu Thr Tyr
340 345 350

Leu Asp Pro Arg Glu Ser Ala Gly Leu Ile Leu Ser Val Leu Phe Ala
355 360 365

Val Tyr Gly Gly His Thr Thr Val Trp Leu Glu Thr Ala Thr Met Glu
370 375 380

Thr Pro Gly Leu Tyr Ala His Leu Ile Thr Lys Tyr Lys Ser Asn Ile
385 390 395 400

Leu Leu Ala Asp Tyr Pro Gly Leu Lys Arg Ala Ala Tyr Asn Tyr Gln
405 410 415

Gln Asp Pro Met Ala Thr Arg Asn Phe Lys Lys Asn Thr Glu Pro Asn
420 425 430

Phe Ala Ser Val Lys Ile Cys Leu Ile Asp Thr Leu Thr Val Asp Cys
435 440 445

Glu Phe His Glu Ile Leu Gly Asp Arg Tyr Phe Arg Pro Leu Arg Asn
450 455 460

Pro Arg Ala Arg Glu Leu Ile Ala Pro Met Leu Cys Leu Pro Glu His
465 470 475 480

Gly Gly Met Ile Ile Ser Val Arg Asp Trp Leu Gly Gly Glu Glu Arg
485 490 495

Met Gly Cys Pro Leu Ser Ile Ala Val Glu Glu Ser Asp Asn Asp Glu
500 505 510

Asp Asp Thr Glu Asp Lys Tyr Ala Ala Ala Asn Gly Tyr Ser Ser Leu
515 520 525

Ile Gly Gly Gly Thr Thr Lys Asn Lys Lys Glu Lys Lys Lys Lys Gly
530 535 540

Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu Ala Leu Lys Met
545 550 555 560

Asn Glu Val Ile Val Leu Ala Ile Gly Glu Glu Ala Ser Lys Arg Ala
565 570 575

Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly Tyr Pro Ile Pro
580 585 590

Asp Ala Thr Leu Ala Ile Val Asp Pro Glu Thr Ser Leu Leu Cys Ser
595 600 605

Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro Ser Leu Ser Gly
610 615 620

Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile Phe His Ala Arg
625 630 635 640

Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln Leu Leu Glu Leu
645 650 655

Glu Phe Leu Arg Thr Gly Leu Leu Gly Phe Val Val Glu Gly Lys Ile
660 665 670

Phe Val Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln Arg Val Glu Trp
675 680 685

Val Glu Asn Gly Gln Leu Glu Ala Glu His Arg Tyr Phe Phe Val Gln
690 695 700

His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys Ile Tyr Asp Cys
705 710 715 720

Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu Pro Ile Ile Leu
725 730 735

Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn Pro Gly Gly Pro
740 745 750

Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu Ser Glu Arg Cys
755 760 765

Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val Tyr Cys Val Met
770 775 780

Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Ile Lys Asn Gly Arg Arg
785 790 795 800

Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp Asn Gly Ser Leu
805 810 815

Pro Cys Val His Val Lys Phe Gly Ile Glu Arg Ser Val Gln Asn Ile
820 825 830

Ala Leu Gly Asp Asp Pro Ala Gly Gly Met Trp Ser Phe Glu Ala Ser
835 840 845

Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys Gln Tyr Ser Gly
850 855 860

Val Asp His Arg Glu Val Val Ile Asp Asp Arg Thr Ser Thr Pro Leu
865 870 875 880

Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp Arg Val Ser Arg
885 890 895

Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly Arg Gly Lys Glu
900 905 910

Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys Val Ala Gly Val
915 920 925

Ala Met Tyr Leu Lys Asn Lys Val Lys Val Gln Ala Gly Asp His Leu
930 935 940

Leu Leu Met Tyr Thr His Ser Glu Glu Phe Val Tyr Ala Val His Ala
945 950 955 960

Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala Pro Ile Asp Gln
965 970 975

Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His Ile Leu Ala Asp
980 985 990

Phe Lys Val Lys Ala Ile Leu Val Asn Ala Asp Val Asp His Leu Met
995 1000 1005

Lys Ile Lys Gln Val Ser Gln His Ile Lys Gln Ser Ala Ala Ile Leu
1010 1015 1020

Lys Ile Ser Val Pro Asn Thr Tyr Ser Thr Thr Lys Pro Pro Lys Gln
1025 1030 1035 1040

Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg Pro Ala Trp Ile
1045 1050 1055



Gln Ala Gly Phe Pro Val Leu Val Trp Thr Tyr Trp Thr Pro Asp Gln
1060 1065 1070

Arg Arg Ile Ala Val Gln Leu Gly His Ser Gln Ile Met Ala Leu Cys
1075 1080 1085

Lys Val Gln Lys Glu Thr Cys Gln Met Thr Ser Thr Arg Pro Val Leu
1090 1095 1100

Gly Cys Val Arg Ser Thr Ile Gly Leu Gly Phe Leu His Thr Cys Leu
1105 1110 1115 1120

Met Gly Ile Phe Leu Ala Ala Pro Thr Tyr Leu Val Ser Pro Val Asp
1125 1130 1135

Phe Ala Gln Asn Pro Asn Ile Leu Phe Gln Thr Leu Ser Arg Tyr Lys
1140 1145 1150

Ile Lys Asp Ala Tyr Ala Thr Ser Gln Met Leu Asp His Ala Ile Ala
1155 1160 1165

Arg Gly Ala Gly Lys Ser Met Ala Leu His Glu Leu Lys Asn Leu Met
1170 1175 1180

Ile Ala Thr Asp Gly Arg Pro Arg Val Asp Val Tyr Gln Arg Val Arg
1185 1190 1195 1200

Val His Phe Ala Pro Ala Asn Leu Asp Pro Thr Ala Ile Asn Thr Val
1205 1210 1215

Tyr Ser His Val Leu Asn Pro Met Val Ala Ser Arg Ser Tyr Met Cys
1220 1225 1230

Ile Glu Pro Val Glu Leu His Leu Asp Val His Ala Leu Arg Arg Gly
1235 1240 1245

Leu Val Met Pro Val Asp Pro Asp Thr Glu Pro Asn Ala Leu Leu Val
1250 1255 1260

Gln Asp Ser Gly Met Val Pro Val Ser Thr Gln Ile Ser Ile Val Asn
1265 1270 1275 1280

Pro Glu Thr Asn Gln Leu Cys Leu Asn Gly Glu Tyr Gly Glu Ile Trp
1285 1290 1295

Val Gln Ser Glu Ala Asn Ala Tyr Ser Phe Tyr Met Ser Lys Glu Arg
1300 1305 1310

Leu Asp Ala Glu Arg Phe Asn Gly Arg Thr Ile Asp Gly Asp Pro Asn
1315 1320 1325

Val Arg Tyr Val Arg Thr Gly Asp Leu Gly Phe Leu His Ser Val Thr
1330 1335 1340

Arg Pro Ile Gly Pro Asn Gly Ala Pro Val Asp Met Gln Val Leu Phe
1345 1350 1355 1360

Val Leu Gly Ser Ile Gly Asp Thr Phe Glu Val Asn Gly Leu Asn His
1365 1370 1375



Phe Ser Met Asp Ile Glu Gln Ser Val Glu Arg Cys His Arg Asn Ile
1380 1385 1390

Val Pro Gly Gly Cys Ala Val Phe Gln Ala Gly Gly Leu Val Val Val
1395 1400 1405

Val Val Glu Ile Phe Arg Arg Asn Phe Leu Ala Ser Met Val Pro Val
1410 1415 1420

Ile Val Asn Ala Ile Leu Asn Glu His Gln Leu Val Ile Asp Ile Val
1425 1430 1435 1440

Ser Phe Val Gln Lys Gly Asp Phe His Arg Ser Arg Leu Gly Glu Lys
1445 1450 1455

Gln Arg Gly Lys Ile Leu Ala Gly Trp Val Thr Arg Lys Met Arg Thr
1460 1465 1470

Ile Ala Gln Tyr Ser Ile Arg Asp Pro Asn Gly Gln Asp Ser Gln Met
1475 1480 1485

Met Ile Thr Glu Glu Pro Gly Pro Arg Ala Ser Met Thr Gly Ser Met
1490 1495 1500

Leu Gly Arg Met Gly Gly Pro Ala Ser Ile Lys Ala Gly Ser Thr Arg
1505 1510 1515 1520

Ala Pro Ser Leu Met Gly Met Thr Ala Thr Met Asn Asn Leu Ser Leu
1525 1530 1535

Thr Gln Gln Gln Gln Gln Gln Tyr Gln Gln Pro Gly Met Tyr Ala Gln
1540 1545 1550

Gln Gln Gly Met His Pro Gln Gln Gln His Gln Phe Ser Met Ser Asn
1555 1560 1565

Thr Pro Pro Gln Gly Pro Pro Gln Gly Val Glu Leu His Asp Pro Ser
1570 1575 1580

Asp Arg Thr Pro Thr Asp Asn Arg His Ser Phe Leu Ala Asp Pro Arg
1585 1590 1595 1600

Met Gln Asn Gln Gly Gln Met Asn Glu Thr Gly Ala Tyr Glu Pro Met
1605 1610 1615

Asn Tyr Gln Asn Ala Tyr His Pro His Gln Gln Gln Tyr Glu Ser Glu
1620 1625 1630

Asp Gly Gly Ser Arg Leu Ser Gly Pro Val Pro Asp Val Leu Arg Pro
1635 1640 1645

Gly Pro Ser Ser Gly Ser Ile Glu Gln His Asp Gln Ala Asn Asn Asp
1650 1655 1660

Asn Asn Met Trp Asn Asn Arg Glu Tyr Tyr Gly Asn Ser Pro Ser Tyr
1665 1670 1675 1680

Ala Gly Gly Tyr Thr Gln Asp Gly Asn Ile His Glu Gln Gln Gln His
1685 1690 1695

Asp Glu Tyr Thr Ser Asn Ala Ser Tyr Gly Gly Asn Gln Gly Ala Gly
1700 1705 1710

Gly Gly Ser Gly Gly Gly Gly Gly Leu Arg Val Ala Asn Arg Asp Ser
1715 1720 1725

Ser Asp Ser Glu Gly Ala Asp Asp Asp Ala Trp Arg Arg Asp Ala Leu
1730 1735 1740

Ala Gln Ile Asn Phe Ala Gly Gly Ala Ala Ala Ala Ser Ala Gly Ala
1745 1750 1755 1760

Pro Ala Ala Gly Ala Ser Ser Ser Gln Pro Gly His Ala Gln
1765 1770



<210> 186 

<211> 530 

<212> PRT
<213> Cochliobolus heterostrophus

<400> 186
Lys Lys Lys Gly Pro Thr Glu Leu Thr Glu Ile Leu Leu Asp Lys Glu
1 5 10 15

Ala Leu Lys Met Asn Glu Val Ile Val Leu Ala Ile Gly Glu Glu Ala
20 25 30

Ser Lys Arg Ala Asn Glu Pro Gly Thr Met Arg Val Gly Ala Phe Gly
35 40 45

Tyr Pro Ile Pro Asp Ala Thr Leu Ala Ile Val Asp Pro Glu Thr Ser
50 55 60

Leu Leu Cys Ser Pro Tyr Ser Ile Gly Glu Ile Trp Val Asp Ser Pro
65 70 75 80

Ser Leu Ser Gly Gly Phe Trp Gln Leu Gln Lys His Thr Glu Thr Ile
85 90 95

Phe His Ala Arg Pro Tyr Arg Phe Val Glu Gly Ser Pro Thr Pro Gln
100 105 110

Leu Leu Glu Leu Glu Phe Leu Arg Thr Gly Leu Leu Gly Phe Val Val
115 120 125

Glu Gly Lys Ile Phe Val Leu Gly Leu Tyr Glu Asp Arg Ile Arg Gln
130 135 140

Arg Val Glu Trp Val Glu Asn Gly Gln Leu Glu Ala Glu His Arg Tyr
145 150 155 160

Phe Phe Val Gln His Leu Val Thr Ser Ile Met Lys Ala Val Pro Lys
165 170 175

Ile Tyr Asp Cys Ser Ser Phe Asp Ser Tyr Val Asn Gly Glu Tyr Leu
180 185 190

Pro Ile Ile Leu Ile Glu Thr Gln Ala Ala Ser Thr Ala Pro Thr Asn
195 200 205

Pro Gly Gly Pro Pro Gln Gln Leu Asp Ile Pro Phe Leu Asp Ser Leu
210 215 220

Ser Glu Arg Cys Met Glu Val Leu Tyr Gln Glu His His Leu Arg Val
225 230 235 240

Tyr Cys Val Met Ile Thr Ala Pro Asn Thr Leu Pro Arg Val Ile Lys
245 250 255

Asn Gly Arg Arg Glu Ile Gly Asn Met Leu Cys Arg Arg Glu Phe Asp
260 265 270

Asn Gly Ser Leu Pro Cys Val His Val Lys Phe Gly Ile Glu Arg Ser
275 280 285

Val Gln Asn Ile Ala Leu Gly Asp Asp Pro Ala Gly Gly Met Trp Ser
290 295 300

Phe Glu Ala Ser Met Ala Arg Gln Gln Phe Leu Met Leu Gln Asp Lys
305 310 315 320

Gln Tyr Ser Gly Val Asp His Arg Glu Val Val Ile Asp Asp Arg Thr
325 330 335

Ser Thr Pro Leu Asn Gln Phe Ser Asn Ile His Asp Leu Met Gln Trp
340 345 350

Arg Val Ser Arg Gln Ala Glu Glu Leu Ala Tyr Cys Thr Val Asp Gly
355 360 365

Arg Gly Lys Glu Gly Lys Gly Val Asn Trp Lys Lys Phe Asp Gln Lys
370 375 380

Val Ala Gly Val Ala Met Tyr Leu Lys Asn Lys Val Lys Val Gln Ala
385 390 395 400

Gly Asp His Leu Leu Leu Met Tyr Thr His Ser Glu Glu Phe Val Tyr
405 410 415

Ala Val His Ala Cys Phe Val Leu Gly Ala Val Cys Ile Pro Met Ala
420 425 430

Pro Ile Asp Gln Asn Arg Leu Asn Glu Asp Ala Pro Ala Leu Leu His
435 440 445

Ile Leu Ala Asp Phe Lys Val Lys Ala Ile Leu Val Asn Ala Asp Val
450 455 460

Asp His Leu Met Lys Ile Lys Gln Val Ser Gln His Ile Lys Gln Ser
465 470 475 480

Ala Ala Ile Leu Lys Ile Ser Val Pro Asn Thr Tyr Ser Thr Thr Lys
485 490 495

Pro Pro Lys Gln Ser Ser Gly Cys Arg Asp Leu Lys Leu Thr Ile Arg
500 505 510

Pro Ala Trp Ile Gln Ala Gly Phe Pro Val Leu Val Trp Thr Tyr Trp
515 520 525

Thr Pro
530

<210> 187 
<211> 1767 
<212> DNA

<213> Cochliobolus heterostrophus 



<400> 187 



atggtctctt caatttcgag ggtggttacc gcccttgggc ttcttttgcc cactgtcact 60
tcatctgtgt acctcataaa agtctcgccc ccagaacacc ggagcaaacg acaaggatat 120
gagactgcct gtaatcatgg tccagaatca agaggatgct ggattgacga cttcaacatc 180
gacaccgaca tggatgttga atggccagat actggaaaga cagtcaagta tcacctgacc 240
atcaccaata ccactggagc tccagacggt tttgaaaggc cgatgttctt gattaatggc 300
caatacccag gaccaactat tactgccgac tggggagatg ttctagagat cacagttacc 360
aatggccttg aaaacaacgg tacaggtata cattggcacg gtctgaggca actcgggaca 420
aacgaacaag atggcgtaaa tggtatcact gaatgcccaa tcgcacccgg tgactccaag 480
ctctacagat tcaaagcaac tcaatatggc actacctggt atcactcgca ctactcggtg 540
cagtatggtg acggcatcgt gggtcctctg atcatcaaag gaccctcaac ggcgaactac 600
gatattgatc ttggcgcttt cccaatgact gactggtttc acgcaaccac cttcaccgtc 660
aacgctgcag ccgttcatgc aaatggccct ccaactgctg acaatgtcct tgtcaatggc 720
tccatgacct catcttttgg cggcaagtac gccgaaacga tcctaactcc gggaaaatct 780
cacttgctgc gtttgatgaa cgttggtatt aacaactacc ttcatgtcgg cctcgatggg 840
catcagttcc aggtcatttc ggctgatttc acgcccattg aacctttcta cacggacagc 900
ttggtccttg cagtcggtca acggtatgaa gtcatcatca acgcaactga agctgtgggc 960
aactactggc tacgtgttgg taccggcggt aactgcgacg gtcccaatgc caatgcagca 1020
aatatcagga gtatcttccg atatgctggc gctccaactg aagacccaga cacgactggt 1080
tcgcttccgt cgggctgcta cgatgaggat gttgtaccct atgccaagac gactgttcct 1140
caggagatgc ccgaacagtt gagcgtgggc ttcaacccta actggactag tgacgtgacg 1200
caaaatcagg gtctggtcca atggctcgtc aacggtaatc ccatggcagt tgatcttgaa 1260
gtccctactc tgcagtcggt gttggatggc aatgttacct acggaaacaa ccgccacgtg 1320
tttgcagtcg acgagaaaca ccaatggcaa tattgggtca tccaacaaaa cagttctaac 1380
ccaccacttc ctcaccccat ccacctccac ggccacgact tctacgtcct cgcacaggtc 1440
gaaaacgcag tctggaacgg agatatttca accctgaaga cggacaaccc catccgtcgg 1500
gacacggccg atcttcccgc tggaggctac ttggtccttg ctttcgagtc ggacaaccct 1560
ggcgcatggc ttatgcactg ccacatcccc ttccacgttg ctgccggtct cggtgtccag 1620
ttcctcgagc gcgaatccga aatcaaggcc caagatggat acgcagagat gcacaggaca 1680
tgtgctaact ggcagtcatg gcgctacaag taccatccca atggcatctt gttccccggt 1740
gactctggtc tacgtcgtcg caactaa 1767



<210> 188
<211> 588
<212> PRT

<213> Cochliobolus heterostrophus

<400> 188

Met Val Ser Ser Ile Ser Arg Val Val Thr Ala Leu Gly Leu Leu Leu
1 5 10 15

Pro Thr Val Thr Ser Ser Val Tyr Leu Ile Lys Val Ser Pro Pro Glu
20 25 30

His Arg Ser Lys Arg Gln Gly Tyr Glu Thr Ala Cys Asn His Gly Pro
35 40 45

Glu Ser Arg Gly Cys Trp Ile Asp Asp Phe Asn Ile Asp Thr Asp Met
50 55 60

Asp Val Glu Trp Pro Asp Thr Gly Lys Thr Val Lys Tyr His Leu Thr
65 70 75 80

Ile Thr Asn Thr Thr Gly Ala Pro Asp Gly Phe Glu Arg Pro Met Phe
85 90 95

Leu Ile Asn Gly Gln Tyr Pro Gly Pro Thr Ile Thr Ala Asp Trp Gly
100 105 110

Asp Val Leu Glu Ile Thr Val Thr Asn Gly Leu Glu Asn Asn Gly Thr
115 120 125

Gly Ile His Trp His Gly Leu Arg Gln Leu Gly Thr Asn Glu Gln Asp
130 135 140

Gly Val Asn Gly Ile Thr Glu Cys Pro Ile Ala Pro Gly Asp Ser Lys
145 150 155 160

Leu Tyr Arg Phe Lys Ala Thr Gln Tyr Gly Thr Thr Trp Tyr His Ser
165 170 175

His Tyr Ser Val Gln Tyr Gly Asp Gly Ile Val Gly Pro Leu Ile Ile
180 185 190

Lys Gly Pro Ser Thr Ala Asn Tyr Asp Ile Asp Leu Gly Ala Phe Pro
195 200 205

Met Thr Asp Trp Phe His Ala Thr Thr Phe Thr Val Asn Ala Ala Ala
210 215 220

Val His Ala Asn Gly Pro Pro Thr Ala Asp Asn Val Leu Val Asn Gly
225 230 235 240

Ser Met Thr Ser Ser Phe Gly Gly Lys Tyr Ala Glu Thr Ile Leu Thr
245 250 255

Pro Gly Lys Ser His Leu Leu Arg Leu Met Asn Val Gly Ile Asn Asn
260 265 270

Tyr Leu His Val Gly Leu Asp Gly His Gln Phe Gln Val Ile Ser Ala
275 280 285

Asp Phe Thr Pro Ile Glu Pro Phe Tyr Thr Asp Ser Leu Val Leu Ala
290 295 300

Val Gly Gln Arg Tyr Glu Val Ile Ile Asn Ala Thr Glu Ala Val Gly
305 310 315 320

Asn Tyr Trp Leu Arg Val Gly Thr Gly Gly Asn Cys Asp Gly Pro Asn
325 330 335

Ala Asn Ala Ala Asn Ile Arg Ser Ile Phe Arg Tyr Ala Gly Ala Pro
340 345 350

Thr Glu Asp Pro Asp Thr Thr Gly Ser Leu Pro Ser Gly Cys Tyr Asp
355 360 365

Glu Asp Val Val Pro Tyr Ala Lys Thr Thr Val Pro Gln Glu Met Pro
370 375 380

Glu Gln Leu Ser Val Gly Phe Asn Pro Asn Trp Thr Ser Asp Val Thr
385 390 395 400

Gln Asn Gln Gly Leu Val Gln Trp Leu Val Asn Gly Asn Pro Met Ala
405 410 415

Val Asp Leu Glu Val Pro Thr Leu Gln Ser Val Leu Asp Gly Asn Val
420 425 430

Thr Tyr Gly Asn Asn Arg His Val Phe Ala Val Asp Glu Lys His Gln
435 440 445

Trp Gln Tyr Trp Val Ile Gln Gln Asn Ser Ser Asn Pro Pro Leu Pro
450 455 460

His Pro Ile His Leu His Gly His Asp Phe Tyr Val Leu Ala Gln Val
465 470 475 480

Glu Asn Ala Val Trp Asn Gly Asp Ile Ser Thr Leu Lys Thr Asp Asn
485 490 495

Pro Ile Arg Arg Asp Thr Ala Asp Leu Pro Ala Gly Gly Tyr Leu Val
500 505 510

Leu Ala Phe Glu Ser Asp Asn Pro Gly Ala Trp Leu Met His Cys His
515 520 525

Ile Pro Phe His Val Ala Ala Gly Leu Gly Val Gln Phe Leu Glu Arg
530 535 540

Glu Ser Glu Ile Lys Ala Gln Asp Gly Tyr Ala Glu Met His Arg Thr
545 550 555 560

Cys Ala Asn Trp Gln Ser Trp Arg Tyr Lys Tyr His Pro Asn Gly Ile
565 570 575

Leu Phe Pro Gly Asp Ser Gly Leu Arg Arg Arg Asn
580 585

<210> 189
<211> 327
<212> DNA

<213> Cochliobolus heterostrophus

<400> 189



atggcgcaag agaagaagga agaacaaccc cagcaagacc acatccccac ctcgccgcag 60
aacgaagagg aggaacaaag caaaggctcc ggcggcctct tgagcgcaat cggagatcca 120
gtcggtacgt ctccttatcc ccccttcctc ctccatctct caacccacaa cctaacccat 180
ctcccaaggc aacgtcctca acaccgccct ccgccccgtc ggcgcgccgc tcgagaaatt 240
cgtcacaggc ccgctgggcg agggtctcgg cggcaccaca cgcggcgcgc tgggcccgtt 300
gatgggccac gaggacgagc gctctga 327



<210> 190 
<211> 108
<212> PRT

<213> Cochliobolus heterostrophus

<400> 190

Met Ala Gln Glu Lys Lys Glu Glu Gln Pro Gln Gln Asp His Ile Pro
1 5 10 15

Thr Ser Pro Gln Asn Glu Glu Glu Glu Gln Ser Lys Gly Ser Gly Gly
20 25 30

Leu Leu Ser Ala Ile Gly Asp Pro Val Gly Thr Ser Pro Tyr Pro Pro
35 40 45

Phe Leu Leu His Leu Ser Thr His Asn Leu Thr His Leu Pro Arg Gln
50 55 60

Arg Pro Gln His Arg Pro Pro Pro Arg Arg Arg Ala Ala Arg Glu Ile
65 70 75 80

Arg His Arg Pro Ala Gly Arg Gly Ser Arg Arg His His Thr Arg Arg
85 90 95

Ala Gly Pro Val Asp Gly Pro Arg Gly Arg Ala Leu
100 105

<210> 191
<211> 1626
<212> DNA

<213> Cochliobolus heterostrophus

<400> 191
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S3 a S3 pt sa a p rr S3 ss 
dddy ddLy dd 




/— r 1 — 1 f-*/~* f~+ 1~* ft S3 sa p 
yccyccy ctctc 


/— f /— f v— -f /—i /— f — \ 4— /™t ft 

gcicgcgciLcg 


+- 4— ft ft ft +" y— 1 /— < 4— 

LdLCgCLCyL 


ft ft ft ft f* a ft ft ss a 

cggyca.gca.d 


a ft ft sa ptp* t~ pi sa a 

dyy cty c uCctct 


ft ft a ft ft ft S3 4— a p 

gcctgygdLctc 


i n 9 n 


tacgtcaatc tgggcgtcgg catccccaca gccgcagcag cattcgtacc cgatggcgtc 1080
aaggtgtggc tgcaatccga aaatggcatt ctaggaatgg gcccctaccc gacggaagaa 1140
gaagtagacg cagacattgt caacgccggc aaagaaaccg taaccctcct tcccggcgcc 1200
tcgacctttg acagcgccga atcctttggc atgatccgcg gcggccacgt cgacgtatcc 1260
atccttggag ctctacaagt cagtgcctct ggcgacctgg ccaactacat ggtgcccggc 1320
aaagtcttca agggtatggg cggcgccatg gatctcgtta gcaatcccga tgctacaaaa 1380
gtcgttgtcg cgactgaaca cgttgctaaa gatggatcca gcaagattgt tcaggaatgc 1440
cagttgccgc ttacaggagc aaagtgcgtg agcactatta ttaccgatct gtgtgtcttt 1500
gaagtaaaca ggaagagggg gactttgacg ctgacggaga cggcgccggg ggttagcgtt 1560
gaggatgtca aggcgaagac ggatgcgaag tttgaagtcg cgagtgatct caagacgatg 1620
gagtag 1626



<210> 192
<211> 541
<212> PRT

<213> Cochliobolus heterostrophus

<400> 192



WO 02/42444 



PCT/US01/43381 




Met Asp Thr Leu Pro Ala Ser Cys Arg Leu Leu Thr Ala Leu Pro Ser
1 5 10 15

Arg Leu Cys Ala Arg Arg Ser Leu Pro Pro Gln Leu Arg Ala Ala Arg
20 25 30

Thr Leu Pro Pro Arg Trp Arg Leu Thr Gly Gln Leu Arg Cys Ile Ser
35 40 45

Glu Arg Ala Pro Thr Ile Asp Arg Ser Lys Ser Lys Leu Phe Lys Asp
50 55 60

Ala Asp Glu Ala Val Ala Asp Val Gln Pro Gly Ser Thr Val Leu Ser
65 70 75 80

Ala Gly Phe Gly Leu Cys Gly Val Ala Asp Thr Leu Ile Ala Ala Met
85 90 95

Lys Lys Arg Gly Pro Glu Ser Leu His Ser Leu Thr Ala Val Ser Asn
100 105 110

Asn Ala Gly Ile Glu Asp Val Gly Gly Leu Ala His Leu Thr Lys Asn
115 120 125

Gly Gln Val Lys Lys Leu Ile Ile Ser Phe Leu Gly Asn Asn Lys Ala
130 135 140

Leu Glu Lys Gln Tyr Leu Ser Gly Gly Ile Glu Ile Glu Leu Cys Pro
145 150 155 160

Gln Gly Thr Leu Ala Glu Arg Ile Arg Ala Gly Gly Ala Gly Ile Pro
165 170 175

Ala Phe Tyr Thr Pro Thr Ala Val Asn Thr Leu Leu Gln Asp Gly Gln
180 185 190

Ile Pro Ala Lys Phe Asp Lys Glu Gly Lys Ala Val Gly Tyr Gly Gln
195 200 205

Lys Arg Glu Val Arg Glu Phe Asn Gly Lys Lys Phe Leu Met Glu Thr
210 215 220

Ala Leu Thr Gly Asp Val Ala Ile Ile Arg Ala His Lys Ala Asp Glu
225 230 235 240

Ala Gly Asn Cys Val Phe Arg Tyr Thr Thr Lys Ala Phe Gly Pro Ile
245 250 255

Met Ala Lys Ala Ala Arg Leu Thr Ile Val Glu Ala Glu Glu Ile Val
260 265 270

Pro Ile Gly Thr Phe Asp Ala Asn Glu Val Asp Leu Pro Gly Ile Phe
275 280 285

Val Asp Arg Ile Val Pro Ala Thr Ala Pro Lys Asn Ile Glu Ile Lys
290 295 300

Lys Leu Arg Lys Pro Ala Ala Ser Lys Asp Ala Ser Ser Lys Asn Glu
305 310 315 320

Ala Ala Glu Arg Arg Asp Arg Ile Ala Arg Arg Ala Ala Lys Glu Leu
325 330 335

Lys Gln Gly Tyr Tyr Val Asn Leu Gly Val Gly Ile Pro Thr Ala Ala
340 345 350

Ala Ala Phe Val Pro Asp Gly Val Lys Val Trp Leu Gln Ser Glu Asn
355 360 365

Gly Ile Leu Gly Met Gly Pro Tyr Pro Thr Glu Glu Glu Val Asp Ala
370 375 380

Asp Ile Val Asn Ala Gly Lys Glu Thr Val Thr Leu Leu Pro Gly Ala
385 390 395 400

Ser Thr Phe Asp Ser Ala Glu Ser Phe Gly Met Ile Arg Gly Gly His
405 410 415

Val Asp Val Ser Ile Leu Gly Ala Leu Gln Val Ser Ala Ser Gly Asp
420 425 430

Leu Ala Asn Tyr Met Val Pro Gly Lys Val Phe Lys Gly Met Gly Gly
435 440 445

Ala Met Asp Leu Val Ser Asn Pro Asp Ala Thr Lys Val Val Val Ala
450 455 460

Thr Glu His Val Ala Lys Asp Gly Ser Ser Lys Ile Val Gln Glu Cys
465 470 475 480

Gln Leu Pro Leu Thr Gly Ala Lys Cys Val Ser Thr Ile Ile Thr Asp
485 490 495

Leu Cys Val Phe Glu Val Asn Arg Lys Arg Gly Thr Leu Thr Leu Thr
500 505 510

Glu Thr Ala Pro Gly Val Ser Val Glu Asp Val Lys Ala Lys Thr Asp
515 520 525

Ala Lys Phe Glu Val Ala Ser Asp Leu Lys Thr Met Glu
530 535 540



<210> 193
<211> 1131
<212> DNA

<213> Cochliobolus heterostrophus

<220>

<221> misc_feature
<222> (1)...(1131)
<223> n = any nucleotide

<400> 193
atgaacataa agacatggct acccccgaaa acgtctgggg cggctggaat gaaactgaaa 60
tcaacaatct gcatgctcat cagaagacnt gctaaaccgc gttggaatcg tggcactcag 120
ccgtacaaga ggaagccttg gccaaaacaa agggacatga aatatattcc tggcaaaagc 180
gagagcgatg gtggtggtgt caactgctgg tctgacagca acggagaccc tgactacgat 240
gtcaggaaac tgctagactg gaacggcgat tggctacctg ctccggaatc atggtccgct 300
cgaagaggac atgaagaccg tcaccttggt gcacatgtag aacaatggat gaatggacac 360
tcacaagagt gcaccagatc cgtatactac ccactcagta ctttcagtcc cgaagatgga 420
ccttgcaaag agctggcacc tcgttactgg cttgaggcga aggttgaggg cagtaacttg 480
agagaatctt ggaagacaat ctctacttcg gacccaaagc cgctggatga tacggacatt 540
actatccatc caccttggtg ggaattgtac gaggatgtgg tctattctga ggtgattcac 600
gaggaaggtc agggtgaaca gcatttcaag cataggagct gttacctgaa cagcctacca 660
gcgccggagg caagaatcga ccctaccgat gcagagcatc ctaccactca tctgatgctg 720
gcttcggctg cagaaaagct tcaagatcta caacaacgta gggaagctaa ggaacgtcgc 780
ttgttggcca aacggaatcg cccagtcgcg aattcgatgt ttccaatgca agccatggaa 840
gatcgtcgcc tacgccctaa gaccaacatg tacattcgtc ctgttcagcc agcagatgtt 900
gttggcattg gaacaaggat gcaaagtttt caaactaaca atattgacag gcgatttaca 960
actactacgt tgagcatacc atttacgcaa ccgagtttga tgggcgcact gaagatcaaa 1020
tccgccagcg aatcaacact gtcaccagtg caggccttcc atacttggtc gcagtctcaa 1080
agagcaacga gtccaggacc aatcccggtt atgttaccga aaagattgta g 1131

<210> 194
<211> 376
<212> PRT

<213> Cochliobolus heterostrophus

<220>

<221> SITE

<222> (1)...(376)

<223> Xaa = any amino acid

<400> 194

Met Asn Ile Lys Thr Trp Leu Pro Pro Lys Thr Ser Gly Ala Ala Gly
1 5 10 15

Met Lys Leu Lys Ser Thr Ile Cys Met Leu Ile Arg Arg Xaa Ala Lys
20 25 30

Pro Arg Trp Asn Arg Gly Thr Gln Pro Tyr Lys Arg Lys Pro Trp Pro
35 40 45

Lys Gln Arg Asp Met Lys Tyr Ile Pro Gly Lys Ser Glu Ser Asp Gly
50 55 60

Gly Gly Val Asn Cys Trp Ser Asp Ser Asn Gly Asp Pro Asp Tyr Asp
65 70 75 80

Val Arg Lys Leu Leu Asp Trp Asn Gly Asp Trp Leu Pro Ala Pro Glu
85 90 95

Ser Trp Ser Ala Arg Arg Gly His Glu Asp Arg His Leu Gly Ala His
100 105 110

Val Glu Gln Trp Met Asn Gly His Ser Gln Glu Cys Thr Arg Ser Val
115 120 125

Tyr Tyr Pro Leu Ser Thr Phe Ser Pro Glu Asp Gly Pro Cys Lys Glu
130 135 140

Leu Ala Pro Arg Tyr Trp Leu Glu Ala Lys Val Glu Gly Ser Asn Leu
145 150 155 160

Arg Glu Ser Trp Lys Thr Ile Ser Thr Ser Asp Pro Lys Pro Leu Asp
165 170 175

Asp Thr Asp Ile Thr Ile His Pro Pro Trp Trp Glu Leu Tyr Glu Asp
180 185 190

Val Val Tyr Ser Glu Val Ile His Glu Glu Gly Gln Gly Glu Gln His
195 200 205

Phe Lys His Arg Ser Cys Tyr Leu Asn Ser Leu Pro Ala Pro Glu Ala
210 215 220

Arg Ile Asp Pro Thr Asp Ala Glu His Pro Thr Thr His Leu Met Leu
225 230 235 240

Ala Ser Ala Ala Glu Lys Leu Gln Asp Leu Gln Gln Arg Arg Glu Ala
245 250 255

Lys Glu Arg Arg Leu Leu Ala Lys Arg Asn Arg Pro Val Ala Asn Ser
260 265 270

Met Phe Pro Met Gln Ala Met Glu Asp Arg Arg Leu Arg Pro Lys Thr
275 280 285

Asn Met Tyr Ile Arg Pro Val Gln Pro Ala Asp Val Val Gly Ile Gly
290 295 300

Thr Arg Met Gln Ser Phe Gln Thr Asn Asn Ile Asp Arg Arg Phe Thr
305 310 315 320

Thr Thr Thr Leu Ser Ile Pro Phe Thr Gln Pro Ser Leu Met Gly Ala
325 330 335

Leu Lys Ile Lys Ser Ala Ser Glu Ser Thr Leu Ser Pro Val Gln Ala
340 345 350

Phe His Thr Trp Ser Gln Ser Gln Arg Ala Thr Ser Pro Gly Pro Ile
355 360 365

Pro Val Met Leu Pro Lys Arg Leu
370 375

<210> 195
<211> 768
<212> DNA

<213> Cochliobolus heterostrophus

<400> 195

atggagaaca tggagatatc ccagcaaatc aaatccacga cattgtctgt tccctcgccg 60
accgcgacac atactgcctg tgtcaacggt gcacgtttgc aaatccgatg tctcaatact 120
ttcgaggtgg tccgtactat cgccctccca tccacccatg atttgcgctc gtcgaagatt 180
acctggtcac ccctggtcat tccgcccttg acctcatcaa cacgcacatc ttcgcccacc 240
actacacctc cccgtcgatc atcacggaca ccacgtccct gctcgaatcg cgtcctcata 300
tccgacgacg acaccgcgcg cgtttacgat ctccgcgatg agaaatggaa tgccgtgatt 360
agcaatggct ctggtggcat ggggaagaat gttcacgtcg agtttggagg aacagaggac 420
gaggtgcttg tttggaccga ctttaccgcc tgtgttaaga tatggtgctt gaagacgggt 480
cgggtagtgg agatacgcga tccgaagttt cctggtaaag atggcaaggg gtggggttac 540
cgacctgctg acgatactgg attgaggaat ggaaggggac aagggcgtgt tctggcatta 600
ttgtgtcgtg catcagggac cgatatcttg ttgcttcttg caccgcagac gtacaaggtt 660
ctgaatcgag tcgaactccc tactacagac gccgctggtc tgagatggag tcgtgacggg 720
cgctggctgg ccatctggga cgctgcgtct gcgggttaca agctttga 768

<210> 196 
<211> 255 
<212> PRT 

<213> Cochliobolus heterostrophus 


<400> 196 

Met Glu Asn Met Glu Ile Ser Gln Gln Ile Lys Ser Thr Thr Leu Ser
1 5 10 15

Val Pro Ser Pro Thr Ala Thr His Thr Ala Cys Val Asn Gly Ala Arg
20 25 30

Leu Gln Ile Arg Cys Leu Asn Thr Phe Glu Val Val Arg Thr Ile Ala
35 40 45

Leu Pro Ser Thr His Asp Leu Arg Ser Ser Lys Ile Thr Trp Ser Pro
50 55 60

Leu Val Ile Pro Pro Leu Thr Ser Ser Thr Arg Thr Ser Ser Pro Thr
65 70 75 80

Thr Thr Pro Pro Arg Arg Ser Ser Arg Thr Pro Arg Pro Cys Ser Asn
85 90 95

Arg Val Leu Ile Ser Asp Asp Asp Thr Ala Arg Val Tyr Asp Leu Arg
100 105 110

Asp Glu Lys Trp Asn Ala Val Ile Ser Asn Gly Ser Gly Gly Met Gly
115 120 125

Lys Asn Val His Val Glu Phe Gly Gly Thr Glu Asp Glu Val Leu Val
130 135 140

Trp Thr Asp Phe Thr Ala Cys Val Lys Ile Trp Cys Leu Lys Thr Gly
145 150 155 160

Arg Val Val Glu Ile Arg Asp Pro Lys Phe Pro Gly Lys Asp Gly Lys
165 170 175

Gly Trp Gly Tyr Arg Pro Ala Asp Asp Thr Gly Leu Arg Asn Gly Arg
180 185 190

Gly Gln Gly Arg Val Leu Ala Leu Leu Cys Arg Ala Ser Gly Thr Asp
195 200 205

Ile Leu Leu Leu Leu Ala Pro Gln Thr Tyr Lys Val Leu Asn Arg Val
210 215 220

Glu Leu Pro Thr Thr Asp Ala Ala Gly Leu Arg Trp Ser Arg Asp Gly
225 230 235 240

Arg Trp Leu Ala Ile Trp Asp Ala Ala Ser Ala Gly Tyr Lys Leu
245 250 255



<210> 197
<211> 723
<212> DNA

<213> Cochliobolus heterostrophus
<400> 197

atggacggtg gatgttgcgt agtggccgat gaattcgccg atgaagatga cgtcgagtgg 60
gagcgctgtg aagccgtgta caagtacacg actggttcgt catgcgcagt ttggataacg 120
agacgattgg ggtcgcaggg gtgccagagg agtgctttca caggagcgta cattatgagg 180
atcgagcggg gacggagact tcgcaggtcc caaatccaga ctgtgcaggg tgtgctgtcg 240
tctctgcttg cgcacattgt gccttcggag ttgaagctga gcatgccgat gccttgtttt 300
aggagcgcat tttcgttctt ttctagggcg gctttgggag gtgtggcggg ttgtggtgtg 360
agtgtgaaac tacgggcgcc caggttgtcg acttgctctg tgtacactgg tgcgctgggt 420
acgtcgatga cgggtgtgtg gtcgaggaac aggatgggtg cgaatgtgcg tgtagaaagg 480
atgcgaacac gacggtccca gccgccaact gcgagacgtt catgtccagg gacccattct 540
agactcttga tgcctaggcc ttctacatcc cattcgctga cgtcctcgga tgcttcgcgg 600
gttatggtgc ggtacaaatg cccatccgcc gtatatatca aagcttgtaa cccgcagacg 660
cagcgtccca gatggccagc cagcgcccgt cacgactcca tctcagacca gcggcgtctg 720
tag 723



<210> 198
<211> 240
<212> PRT

<213> Cochliobolus heterostrophus

<400> 198

Met Asp Gly Gly Cys Cys Val Val Ala Asp Glu Phe Ala Asp Glu Asp
1 5 10 15

Asp Val Glu Trp Glu Arg Cys Glu Ala Val Tyr Lys Tyr Thr Thr Gly
20 25 30

Ser Ser Cys Ala Val Trp Ile Thr Arg Arg Leu Gly Ser Gln Gly Cys
35 40 45

Gln Arg Ser Ala Phe Thr Gly Ala Tyr Ile Met Arg Ile Glu Arg Gly
50 55 60

Arg Arg Leu Arg Arg Ser Gln Ile Gln Thr Val Gln Gly Val Leu Ser
65 70 75 80

Ser Leu Leu Ala His Ile Val Pro Ser Glu Leu Lys Leu Ser Met Pro
85 90 95

Met Pro Cys Phe Arg Ser Ala Phe Ser Phe Phe Ser Arg Ala Ala Leu
100 105 110

Gly Gly Val Ala Gly Cys Gly Val Ser Val Lys Leu Arg Ala Pro Arg
115 120 125

Leu Ser Thr Cys Ser Val Tyr Thr Gly Ala Leu Gly Thr Ser Met Thr
130 135 140

Gly Val Trp Ser Arg Asn Arg Met Gly Ala Asn Val Arg Val Glu Arg
145 150 155 160

Met Arg Thr Arg Arg Ser Gln Pro Pro Thr Ala Arg Arg Ser Cys Pro
165 170 175

Gly Thr His Ser Arg Leu Leu Met Pro Arg Pro Ser Thr Ser His Ser
180 185 190

Leu Thr Ser Ser Asp Ala Ser Arg Val Met Val Arg Tyr Lys Cys Pro
195 200 205

Ser Ala Val Tyr Ile Lys Ala Cys Asn Pro Gln Thr Gln Arg Pro Arg
210 215 220

Trp Pro Ala Ser Ala Arg His Asp Ser Ile Ser Asp Gln Arg Arg Leu
225 230 235 240



<210> 199 
<211> 1647 
<212> DNA 
<213> Cochliobolus heterostrophus
<400> 199



atgaacgtca agcaagcggc atgtctgaat tgccgcaaaa gcaagataaa atgccggcgc 60
gaagaaggcg cttctgtgtg tgaaagatgc tctagcgtag gcgtcgaatg cattataccc 120
gagttccata ttggtaggca aaagggcgtg aaaaacaaac gatcagggtt ggagaaagca 180
atctaccaag tagaagaagc aatcaagaag agaaaatcag acgtagctgt caaccagagc 240
acgttacagc atttgcaaca gcttttgaac gaagcacaag gagacgttgg ccctagtcaa 300
gatgcaaaat caccgccagt actagcagaa ctatcttatg tgccagcaaa agaagttgcc 360
agcacttcaa gcgatgatca gcttgccgtt gaagatgtcg agaatccgct tcagctttta 420
gcccgcgcat cagacttgag gattgccacc accccacagt cgtacaatac aagtgtcgcc 480
agcccagaag gcaggtttac tggtagcgag caaagcgcat tcctcgatgt tcatcacttc 540
ttcttaccaa tgaaggcgca tttggaccaa ggatctgggt tagatccaat tgatgtagga 600
ttggttacca aagatgaagc ggagatgctc ctccaatatt tccacaaaag actagctcac 660
acgcgctggg gtctagaccc agtggtgcat actctacctt ttgtccgaaa ccgctcagcc 720
tttctgttta cgacattgct ggctgtgacg gccgtcttcc taccagaaac gtctgctttg 780
gccaaaagac tacttcttca ccgcaggttt ctagctgaac aggtcattgt tcgaaagtac 840
agatccgttg aaatcgtcct ggcattcatg gtgagcatac catggatgcc cccagggtcg 900
catgcaagcg acgacgacac aagtctctat ctagctacgg cattgtctat ttctttggat 960
cttatgctag acaaagtcat cactccatct acgtcctttg gtccggagct cacgaggcag 1020
atgcccaaag cagagtgtct tgacgcaaga aaagcactag ctatggatgg tttcgaggac 1080
attgacccga cttctgaatg gggccagcga ctgcttcgtc ggagagaaag ggtctggatt 1140
gcgctgtttg tgctagagcg tggcgtgtgc ctcgctcgtg gccgcagcta ctgtgtacca 1200
aagacgtgct tgattcaata cagcgataaa tggcatgacc accagcactc ggatgcccag 1260
gacggtccgc tagtatccat ggcagtatta cgtcgcgatc tcgacaacct ttttgccgaa 1320
gtacgcacgc gatgcgacaa ctatggctcg gccgaagtag gttcccaggt tgcgcaggaa 1380
atcgacaagt caattgaggg cttcttcgac aattggtctc gggcatggcc ttcagttata 1440
agtgacccag agagcaagag cctaccccct tatgtcgaga tactcgttac acacacacga 1500
ctctcgacct actcaatgct tctgaaccat ccgagcgcgc caccagaagt caagcgctcg 1560
ttccgcaagt ctgcgttatc ctcggcgctc aatgttatgc gccgcagcaa tccaaggcga 1620
gggacctctc aagtcaatgc ccaataa 1647



<210> 200 
<211> 548 
<212> PRT 

<213> Cochliobolus heterostrophus 


<400> 200 

Met Asn Val Lys Gln Ala Ala Cys Leu Asn Cys Arg Lys Ser Lys Ile
1 5 10 15

Lys Cys Arg Arg Glu Glu Gly Ala Ser Val Cys Glu Arg Cys Ser Ser
20 25 30

Val Gly Val Glu Cys Ile Ile Pro Glu Phe His Ile Gly Arg Gln Lys
35 40 45

Gly Val Lys Asn Lys Arg Ser Gly Leu Glu Lys Ala Ile Tyr Gln Val
50 55 60

Glu Glu Ala Ile Lys Lys Arg Lys Ser Asp Val Ala Val Asn Gln Ser
65 70 75 80

Thr Leu Gln His Leu Gln Gln Leu Leu Asn Glu Ala Gln Gly Asp Val
85 90 95

Gly Pro Ser Gln Asp Ala Lys Ser Pro Pro Val Leu Ala Glu Leu Ser
100 105 110

Tyr Val Pro Ala Lys Glu Val Ala Ser Thr Ser Ser Asp Asp Gln Leu
115 120 125

Ala Val Glu Asp Val Glu Asn Pro Leu Gln Leu Leu Ala Arg Ala Ser
130 135 140

Asp Leu Arg Ile Ala Thr Thr Pro Gln Ser Tyr Asn Thr Ser Val Ala
145 150 155 160

Ser Pro Glu Gly Arg Phe Thr Gly Ser Glu Gln Ser Ala Phe Leu Asp
165 170 175

Val His His Phe Phe Leu Pro Met Lys Ala His Leu Asp Gln Gly Ser
180 185 190

Gly Leu Asp Pro Ile Asp Val Gly Leu Val Thr Lys Asp Glu Ala Glu
195 200 205

Met Leu Leu Gln Tyr Phe His Lys Arg Leu Ala His Thr Arg Trp Gly
210 215 220

Leu Asp Pro Val Val His Thr Leu Pro Phe Val Arg Asn Arg Ser Ala
225 230 235 240

Phe Leu Phe Thr Thr Leu Leu Ala Val Thr Ala Val Phe Leu Pro Glu
245 250 255

Thr Ser Ala Leu Ala Lys Arg Leu Leu Leu His Arg Arg Phe Leu Ala
260 265 270

Glu Gln Val Ile Val Arg Lys Tyr Arg Ser Val Glu Ile Val Leu Ala
275 280 285

Phe Met Val Ser Ile Pro Trp Met Pro Pro Gly Ser His Ala Ser Asp
290 295 300

Asp Asp Thr Ser Leu Tyr Leu Ala Thr Ala Leu Ser Ile Ser Leu Asp
305 310 315 320

Leu Met Leu Asp Lys Val Ile Thr Pro Ser Thr Ser Phe Gly Pro Glu
325 330 335

Leu Thr Arg Gln Met Pro Lys Ala Glu Cys Leu Asp Ala Arg Lys Ala
340 345 350

Leu Ala Met Asp Gly Phe Glu Asp Ile Asp Pro Thr Ser Glu Trp Gly
355 360 365

Gln Arg Leu Leu Arg Arg Arg Glu Arg Val Trp Ile Ala Leu Phe Val
370 375 380

Leu Glu Arg Gly Val Cys Leu Ala Arg Gly Arg Ser Tyr Cys Val Pro
385 390 395 400

Lys Thr Cys Leu Ile Gln Tyr Ser Asp Lys Trp His Asp His Gln His
405 410 415

Ser Asp Ala Gln Asp Gly Pro Leu Val Ser Met Ala Val Leu Arg Arg
420 425 430

Asp Leu Asp Asn Leu Phe Ala Glu Val Arg Thr Arg Cys Asp Asn Tyr
435 440 445

Gly Ser Ala Glu Val Gly Ser Gln Val Ala Gln Glu Ile Asp Lys Ser
450 455 460

Ile Glu Gly Phe Phe Asp Asn Trp Ser Arg Ala Trp Pro Ser Val Ile
465 470 475 480

Ser Asp Pro Glu Ser Lys Ser Leu Pro Pro Tyr Val Glu Ile Leu Val
485 490 495

Thr His Thr Arg Leu Ser Thr Tyr Ser Met Leu Leu Asn His Pro Ser
500 505 510

Ala Pro Pro Glu Val Lys Arg Ser Phe Arg Lys Ser Ala Leu Ser Ser
515 520 525

Ala Leu Asn Val Met Arg Arg Ser Asn Pro Arg Arg Gly Thr Ser Gln
530 535 540

Val Asn Ala Gln
545



<210> 201 
<211> 2271 
<212> DNA

<213> Cochliobolus heterostrophus 



<400> 201 

atggcggacg cagagcagac aatcaacctc aaggtccttt cgccttcagc ggaactagag 60
ggcggcatca ccctcgcggg cctacccgct tctatcacgg tcaaagagct ccgcacccgc 120
atacacgatg ctgtgccctc caagcctgcc cccgagcgca tgcgcctcat atacagaggc 180
cgagtggtag cgaatgatgc agacactctg actaccgtgt ttggcgctga caatatacgt 240
gagaacaaga accaaagcct tcacctcgtc atacgagagc tgcctccaac tgcatcttcg 300
cctgtcccgc aatcgtcttc tgtcccacca aacctcttcc gctctgctgg tccagatggc 360
ccagccgcga gccctctgca gacgaatcca tttcgggcta taccacagac acgaccggct 420
tcacaacctc aaatacccca gtcgcacctt ccgcctcatc gccttccggg acaagtgaac 480
cccattccca taccattacc cgcacaactc catcaaacgt ttgctcaagc aatggcacac 540
caaggacaac agggtgatga acagccctca gatcgaacta gcgagcagcc agatcaaggt 600
acaccggcag cgggggatag gacgcataca ccaatccctt caggaccgtc gaaccctcct 660
ggaaatggcg accaggcgat caggcgagaa ggtgttgcgc ctaatggagc acgatggaca 720
gttacggcct tcaatccact taacatagct gcgcgactcc cgccgcctgt cgtcacattc 780
cctgtcccgc atgcactaac tttcggtcgt ccgccgcttt ctagcgacaa ccagcggtta 840
ttgcctcgtg tgcacaggat cttcttggag acaaaacggg agattgataa cattcgagca 900
ttgttgcaac tgcctggtgc atctgatgca cagagtggag ggctcctcac ctcagatata 960
cctgcctcgt tgaatatccc tgtatggcga atcgagcgac tacgtcagca cctgaacaca 1020
gtcaatcaaa atctggatgt cgttgaccgg gctctggcgt tgcttcctac agagcctgaa 1080
gtgacggcgc tcaggcgctc agctaccgag ttgagggttg atgctgcgga attgagtatt 1140
gtgctcgatc gtcaacaggg cgaaacggcc agggctactt cggatacagc accaggggtg 1200
cccaccatag ctgcggcatc atcaactaca tcccagaccc gaccaggaga tgtgacacag 1260
actgtaccga cagatgcacc tgcagagctg ttcctuttgt caagtcccca gggtccggta 1320
ggagttctct tcgatcagcg aggcacatac accacagccc caatggtgcc cactctacca 1380
ttccagagct tctcgagtca atttgcacag aacagacagc tcattgctgg tcttgggcag 1440
caaatggcac aggggacaaa ccacctgcat aatcaagtat ctaacatgca gccaacacca 1500
atagggcagc cagtagctgt tggacaggct caagatcata accgaggata tgatcagaat 1560
cagaatcaga atcagaatca aaaccagaac cagaatgata atcagaatgg agtgcagcca 1620
gaagaaaatg atcggatggc caatatcgcc ggacatttgt ggctgatctt caagctcgct 1680
gtcttcgtct acgtcttcgc tggaggtggt ggtatttaca ggcctgtaat gctaggtgct 1740
attgctggga ttgtctatct ggcacagatc ggcatgtttg aggaccaga u caacuaegt-g 1800
cgtcgccatt ttgaggctct tcttcctgtt ggcgctatgg ccgaacgcgc tgcacaaccc 1860
atcaaccagc gcccacgagg taacatatcg cccgaggaag cagcaaggcg aatac uacaa 1920
caaagacaag aacaaaggtt cgcctggtta cgcgagagct tgcgtggagt cgagcgcgct 1980
ttcactctct tcattgccag tctattccct ggtgtaggcg agagaatggt tcacgcacag 2040
gaagagagag agagactgga gagggtagca gcacgggaag agagagagag acaggaggag 2100
gaagcgagga agcgagaaga agacgccagg gcacagcagc aacagcagac cgatgagaaa 2160
gctagtgaag ccagggttga gatggacagt gaggttactc caagcagcag ttcaaagggc 2220
aaggagaggg ctgaggagca acacgttgat gggtcagcct catcttcatg a 2271



<210> 202 
<211> 756 
<212> PRT

<213> Cochliobolus heterostrophus 



<400> 202 

Met Ala Asp Ala Glu Gln Thr Ile Asn Leu Lys Val Leu Ser Pro Ser
1 5 10 15

Ala Glu Leu Glu Gly Gly Ile Thr Leu Ala Gly Leu Pro Ala Ser Ile
20 25 30

Thr Val Lys Glu Leu Arg Thr Arg Ile His Asp Ala Val Pro Ser Lys
35 40 45

Pro Ala Pro Glu Arg Met Arg Leu Ile Tyr Arg Gly Arg Val Val Ala
50 55 60

Asn Asp Ala Asp Thr Leu Thr Thr Val Phe Gly Ala Asp Asn Ile Arg
65 70 75 80

Glu Asn Lys Asn Gln Ser Leu His Leu Val Ile Arg Glu Leu Pro Pro
85 90 95

Thr Ala Ser Ser Pro Val Pro Gln Ser Ser Ser Val Pro Pro Asn Leu
100 105 110

Phe Arg Ser Ala Gly Pro Asp Gly Pro Ala Ala Ser Pro Leu Gln Thr
115 120 125

Asn Pro Phe Arg Ala Ile Pro Gln Thr Arg Pro Ala Ser Gln Pro Gln
130 135 140

Ile Pro Gln Ser His Leu Pro Pro His Arg Leu Pro Gly Gln Val Asn
145 150 155 160

Pro Ile Pro Ile Pro Leu Pro Ala Gln Leu His Gln Thr Phe Ala Gln
165 170 175

Ala Met Ala His Gln Gly Gln Gln Gly Asp Glu Gln Pro Ser Asp Arg
180 185 190

Thr Ser Glu Gln Pro Asp Gln Gly Thr Pro Ala Ala Gly Asp Arg Thr
195 200 205

His Thr Pro Ile Pro Ser Gly Pro Ser Asn Pro Pro Gly Asn Gly Asp
210 215 220

Gln Ala Ile Arg Arg Glu Gly Val Ala Pro Asn Gly Ala Arg Trp Thr
225 230 235 240

Val Thr Ala Phe Asn Pro Leu Asn Ile Ala Ala Arg Leu Pro Pro Pro
245 250 255

Val Val Thr Phe Pro Val Pro His Ala Leu Thr Phe Gly Arg Pro Pro
260 265 270

Leu Ser Ser Asp Asn Gln Arg Leu Leu Pro Arg Val His Arg Ile Phe
275 280 285

Leu Glu Thr Lys Arg Glu Ile Asp Asn Ile Arg Ala Leu Leu Gln Leu
290 295 300

Pro Gly Ala Ser Asp Ala Gln Ser Gly Gly Leu Leu Thr Ser Asp Ile
305 310 315 320

Pro Ala Ser Leu Asn Ile Pro Val Trp Arg Ile Glu Arg Leu Arg Gln
325 330 335

His Leu Asn Thr Val Asn Gln Asn Leu Asp Val Val Asp Arg Ala Leu
340 345 350

Ala Leu Leu Pro Thr Glu Pro Glu Val Thr Ala Leu Arg Arg Ser Ala
355 360 365

Thr Glu Leu Arg Val Asp Ala Ala Glu Leu Ser Ile Val Leu Asp Arg
370 375 380

Gln Gln Gly Glu Thr Ala Arg Ala Thr Ser Asp Thr Ala Pro Gly Val
385 390 395 400

Pro Thr Ile Ala Ala Ala Ser Ser Thr Thr Ser Gln Thr Arg Pro Gly
405 410 415

Asp Val Thr Gln Thr Val Pro Thr Asp Ala Pro Ala Glu Leu Phe Leu
420 425 430

Leu Ser Ser Pro Gln Gly Pro Val Gly Val Leu Phe Asp Gln Arg Gly
435 440 445

Thr Tyr Thr Thr Ala Pro Met Val Pro Thr Leu Pro Phe Gln Ser Phe
450 455 460

Ser Ser Gln Phe Ala Gln Asn Arg Gln Leu Ile Ala Gly Leu Gly Gln
465 470 475 480

Gln Met Ala Gln Gly Thr Asn His Leu His Asn Gln Val Ser Asn Met
485 490 495

Gln Pro Thr Pro Ile Gly Gln Pro Val Ala Val Gly Gln Ala Gln Asp

500 505 510 

His Asn Arg Gly Tyr Asp Gin Asn Gin Asn Gin Asn Gin Asn Gin Asn 
515 520 525 

2 5Gln Asn Gin Asn Asp Asn Gin Asn Gly Val Gin Pro Glu Glu Asn Asp 

530 535 540 

Arg Met Ala Asn He Ala Gly His Leu Trp Leu He Phe Lys Leu Ala 
545 550 555 560 

Val Phe Val Tyr Val Phe Ala Gly Gly Gly Gly He Tyr Arg Pro Val 
30 565 570 575 

Met Leu Gly Ala He Ala Gly He Val Tyr Leu Ala Gin He Gly Met 

580 585 590 

Phe Glu Asp Gin He Asn Tyr Val Arg Arg His Phe Glu Ala Leu Leu 
595 600 605 

3 5Pro Val Gly Ala Met Ala Glu Arg Ala Ala Gin Pro He Asn Gin Arg 

610 615 620 

Pro Arg Gly Asn He Ser Pro Glu Glu Ala Ala Arg Arg He Leu Gin 
625 630 635 640 

Gin Arg Gin Glu Gin Arg Phe Ala Trp Leu Arg Glu Ser Leu Arg Gly 
40 645 650 655 



WO 02/42444 



PCT/US01/43381 




Val Glu Arg Ala Phe Thr Leu Phe Ile Ala Ser Leu Phe Pro Gly Val
660 665 670

Gly Glu Arg Met Val His Ala Gln Glu Glu Arg Glu Arg Leu Glu Arg
675 680 685

Val Ala Ala Arg Glu Glu Arg Glu Arg Gln Glu Glu Glu Ala Arg Lys
690 695 700

Arg Glu Glu Asp Ala Arg Ala Gln Gln Gln Gln Gln Thr Asp Glu Lys
705 710 715 720

Ala Ser Glu Ala Arg Val Glu Met Asp Ser Glu Val Thr Pro Ser Ser
725 730 735

Ser Ser Lys Gly Lys Glu Arg Ala Glu Glu Gln His Val Asp Gly Ser
740 745 750

Ala Ser Ser Ser
755

<210> 203
<211> 489
<212> DNA
<213> Cochliobolus heterostrophus

<400> 203

atggcgctaa tccctcccca gtgggtacgt catgggttag ccgctcatgt ggattttccc 60
acaccaccca acgccttcgc cgtcatcttc ccgctttgcc attgcgaatc tcctcatctt 120
gatgtggccg agaaaacaac ggtagaaatc gcaaatgcat gcatcataac atggcgactc 180
cctgttcgcg ccagctcatt cgagcttcta tccgaccgcg atgcaatata cccacacacc 240
cacatgcgcg cctctctgcc ccatctcgca ccttcacgtc gactcgagcg tcatatagcg 300
aacaaaattt tcacaggaag gagtctttta ggtcgcggct caattccgca ctcaagaata 360
ccaaagtcaa atgggagcca ataccaattg cactcggtat tggcttcctg ggtgcatttc 420
agctatatcg catacaacgc agagaaaagc atacagaagc cgagagaagg gatgcggatg 480
gcaatgtag 489

<210> 204
<211> 162
<212> PRT
<213> Cochliobolus heterostrophus

<400> 204

Met Ala Leu Ile Pro Pro Gln Trp Val Arg His Gly Leu Ala Ala His
1 5 10 15

Val Asp Phe Pro Thr Pro Pro Asn Ala Phe Ala Val Ile Phe Pro Leu
20 25 30

Cys His Cys Glu Ser Pro His Leu Asp Val Ala Glu Lys Thr Thr Val
35 40 45

Glu Ile Ala Asn Ala Cys Ile Ile Thr Trp Arg Leu Pro Val Arg Ala
50 55 60

Ser Ser Phe Glu Leu Leu Ser Asp Arg Asp Ala Ile Tyr Pro His Thr
65 70 75 80

His Met Arg Ala Ser Leu Pro His Leu Ala Pro Ser Arg Arg Leu Glu
85 90 95

Arg His Ile Ala Asn Lys Ile Phe Thr Gly Arg Ser Leu Leu Gly Arg
100 105 110

Gly Ser Ile Pro His Ser Arg Ile Pro Lys Ser Asn Gly Ser Gln Tyr
115 120 125

Gln Leu His Ser Val Leu Ala Ser Trp Val His Phe Ser Tyr Ile Ala
130 135 140

Tyr Asn Ala Glu Lys Ser Ile Gln Lys Pro Arg Glu Gly Met Arg Met
145 150 155 160

Ala Met



<210> 205
<211> 1581
<212> DNA
<213> Cochliobolus heterostrophus

<220>
<221> misc_feature
<222> (1)...(1581)
<223> n = any nucleotide

<400> 205

atgcatcata acatggcgac tccctgttcg cgccagctca ttcgagcttc tatccgaccg 60
cgatgcaata tacccacaca cccacatgcg cgcctctctg ccccatctcg caccttcacg 120
tcgactcgag cgtcatatag cgaacaaaat tttcacagga aggagtcttt taggtcgcgg 180
ctcaattccg cactcaagaa taccaaagtc aaatgggagc caataccaat tgcactcggt 240
attggcttcc tgggtgcatt tcagctatat cgcatacaac gcagagaaaa gcatacagaa 300
gccgagagaa gggatgcgga tggcaatgta gtggatcagc aaggtcgtcc gaagaagcgc 360
gaaagaataa gaccgagcgg accatggacc gttcaggtca tgtctaccct tcctctcaag 420
gcgttgtcgc gactgtgggg tcgcttcaat gagatcgaca taccctacta ccttcatcta 480
catgtatacc ccaacctcgc cgcctttttc taccgcaccc tcaaacccgg tgtacgtcct 540
ctagatccca accccaacgc agtactctct cccgcagacg gcaagatcat tcaatttggc 600
accatcgagc acggcgaagt tgagcaagtc aaaggtgtaa catatagttt ggacgctctg 660
ctaggatcta caaggnccag tacaccagag caaaatgtag caaattccca aattcgcgct 720
agtgagcacg agaagacacc acaagacgaa gaggacactg tgcgcgcgga tgaggaattt 780
gcaaacgtga acggtatctc atatactcta ccaaacctct tctccggacc atggccaaaa 840
gacgggaagc ctgctgaaat gccgacggat caatcagttc cgtcaaagcc atcgtcagaa 900
gccgaagtac gtgccgacct tgccttgagt gaatcacagc gcccatggtg ggcacccgcc 960
tcattaaaga cacctacggt tctctactac tgcgttgtat atcttgcgcc aggcgactac 1020
cacaggttcc actcacctgt atcatgggtt gttgagtcgc gtcgtcactt tgctggcgag 1080
ctttatagtg tatcgcccta cctacaacgc actatgcctg gtctctttac cctgaacgag 1140
cgtgtggttc tcctaggaag atggcgctgg ggtttctttt cctacactcc ggtcggcgca 1200
accaacgttg gttccattaa gatcaacttt gatcgcgaac ttcgcacaaa cagcttaaca 1260
accgacactg cggcggaccg tgctgcggaa gaagccgctg cccgtggtga gccgtattct 1320
ggattcgctg aggcctccta cacgagcgca agccgtgtct tgggagggta cgcactcaag 1380
cgcggcgagg aaatgggtgg ttttcagttg ggcagtacaa ttgtcttagt ctttgaagcg 1440
ccgaagggca ttcgacctag tttggacgag ggctttagtg gtacacgtgg cgagagaaaa 1500
ggtgggtttc actggaatat cgaacaaggg caaaaagtca aggttggcga ggcgttgggt 1560
tatgttgaag aagttcagta a 1581



<210> 206
<211> 526
<212> PRT
<213> Cochliobolus heterostrophus

<220>
<221> SITE
<222> (1)...(526)
<223> Xaa = any amino acid

<400> 206

Met His His Asn Met Ala Thr Pro Cys Ser Arg Gln Leu Ile Arg Ala
1 5 10 15

Ser Ile Arg Pro Arg Cys Asn Ile Pro Thr His Pro His Ala Arg Leu
20 25 30

Ser Ala Pro Ser Arg Thr Phe Thr Ser Thr Arg Ala Ser Tyr Ser Glu
35 40 45

Gln Asn Phe His Arg Lys Glu Ser Phe Arg Ser Arg Leu Asn Ser Ala
50 55 60

Leu Lys Asn Thr Lys Val Lys Trp Glu Pro Ile Pro Ile Ala Leu Gly
65 70 75 80

Ile Gly Phe Leu Gly Ala Phe Gln Leu Tyr Arg Ile Gln Arg Arg Glu
85 90 95

Lys His Thr Glu Ala Glu Arg Arg Asp Ala Asp Gly Asn Val Val Asp
100 105 110

Gln Gln Gly Arg Pro Lys Lys Arg Glu Arg Ile Arg Pro Ser Gly Pro
115 120 125

Trp Thr Val Gln Val Met Ser Thr Leu Pro Leu Lys Ala Leu Ser Arg
130 135 140

Leu Trp Gly Arg Phe Asn Glu Ile Asp Ile Pro Tyr Tyr Leu His Leu
145 150 155 160

His Val Tyr Pro Asn Leu Ala Ala Phe Phe Tyr Arg Thr Leu Lys Pro
165 170 175

Gly Val Arg Pro Leu Asp Pro Asn Pro Asn Ala Val Leu Ser Pro Ala
180 185 190

Asp Gly Lys Ile Ile Gln Phe Gly Thr Ile Glu His Gly Glu Val Glu
195 200 205

Gln Val Lys Gly Val Thr Tyr Ser Leu Asp Ala Leu Leu Gly Ser Thr
210 215 220

Arg Xaa Ser Thr Pro Glu Gln Asn Val Ala Asn Ser Gln Ile Arg Ala
225 230 235 240

Ser Glu His Glu Lys Thr Pro Gln Asp Glu Glu Asp Thr Val Arg Ala
245 250 255

Asp Glu Glu Phe Ala Asn Val Asn Gly Ile Ser Tyr Thr Leu Pro Asn
260 265 270

Leu Phe Ser Gly Pro Trp Pro Lys Asp Gly Lys Pro Ala Glu Met Pro
275 280 285

Thr Asp Gln Ser Val Pro Ser Lys Pro Ser Ser Glu Ala Glu Val Arg
290 295 300

Ala Asp Leu Ala Leu Ser Glu Ser Gln Arg Pro Trp Trp Ala Pro Ala
305 310 315 320

Ser Leu Lys Thr Pro Thr Val Leu Tyr Tyr Cys Val Val Tyr Leu Ala
325 330 335

Pro Gly Asp Tyr His Arg Phe His Ser Pro Val Ser Trp Val Val Glu
340 345 350

Ser Arg Arg His Phe Ala Gly Glu Leu Tyr Ser Val Ser Pro Tyr Leu
355 360 365

Gln Arg Thr Met Pro Gly Leu Phe Thr Leu Asn Glu Arg Val Val Leu
370 375 380

Leu Gly Arg Trp Arg Trp Gly Phe Phe Ser Tyr Thr Pro Val Gly Ala
385 390 395 400

Thr Asn Val Gly Ser Ile Lys Ile Asn Phe Asp Arg Glu Leu Arg Thr
405 410 415

Asn Ser Leu Thr Thr Asp Thr Ala Ala Asp Arg Ala Ala Glu Glu Ala
420 425 430

Ala Ala Arg Gly Glu Pro Tyr Ser Gly Phe Ala Glu Ala Ser Tyr Thr
435 440 445

Ser Ala Ser Arg Val Leu Gly Gly Tyr Ala Leu Lys Arg Gly Glu Glu
450 455 460

Met Gly Gly Phe Gln Leu Gly Ser Thr Ile Val Leu Val Phe Glu Ala
465 470 475 480

Pro Lys Gly Ile Arg Pro Ser Leu Asp Glu Gly Phe Ser Gly Thr Arg
485 490 495

Gly Glu Arg Lys Gly Gly Phe His Trp Asn Ile Glu Gln Gly Gln Lys
500 505 510

Val Lys Val Gly Glu Ala Leu Gly Tyr Val Glu Glu Val Gln
515 520 525

<210> 207
<211> 366
<212> DNA
<213> Cochliobolus heterostrophus

<400> 207

atgcccgaga ccaattgtcg aagaagccct caattgactt gtcgatttcc tgcgcaacct 60
gggaacctac ttcggccgag ccatagttgt cgcatcgcgt gcgtacttcg gcaaaaaggt 120
tgtcgagatc gcgacgtaat actgccatgg atactagcgg accgtcctgg gcatccgagt 180
gctggtggtc atgccattta tcgctgtatt gaatcaagca cgtctttggt acacagtagc 240
tgcggccacg agcgaggcac acgccacgct ctagcacaaa cagcgcaatc cagacccttt 300
ctctccgacg aagcagtcgc tggccccatt cagaagtcgg gtcaatgtcc tcgaaaccat 360
ccatag 366

<210> 208
<211> 121
<212> PRT
<213> Cochliobolus heterostrophus

<400> 208

Met Pro Glu Thr Asn Cys Arg Arg Ser Pro Gln Leu Thr Cys Arg Phe
1 5 10 15

Pro Ala Gln Pro Gly Asn Leu Leu Arg Pro Ser His Ser Cys Arg Ile
20 25 30

Ala Cys Val Leu Arg Gln Lys Gly Cys Arg Asp Arg Asp Val Ile Leu
35 40 45

Pro Trp Ile Leu Ala Asp Arg Pro Gly His Pro Ser Ala Gly Gly His
50 55 60

Ala Ile Tyr Arg Cys Ile Glu Ser Ser Thr Ser Leu Val His Ser Ser
65 70 75 80

Cys Gly His Glu Arg Gly Thr Arg His Ala Leu Ala Gln Thr Ala Gln
85 90 95

Ser Arg Pro Phe Leu Ser Asp Glu Ala Val Ala Gly Pro Ile Gln Lys
100 105 110

Ser Gly Gln Cys Pro Arg Asn His Pro
115 120

<210> 209
<211> 714
<212> DNA
<213> Cochliobolus heterostrophus

<400> 209

atgtgcgcaa gcagagacga cagcacaccc tgcacagtct ggatttggga cctgcgaagt 60
ctccgtcccc gctcgatcct cataatgtac gctcctgtga aagcactcct ctggcacccc 120


tgcgacccca 


atcgtctcgt 


tatccaaact 


gcgcatgacg 


aaccagtcgt 
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